BACKGROUND OF THE INVENTION



This application claims priority to and specifically incorporates by reference, the content 
of U.S. Provisional Application Serial No. 60/100,403 filed September 15, 1998 and U.S. 
Provisional Application Serial No. 60/102,392 filed September 29, 1998. The entire text of each
of the above-referenced disclosures is specifically incorporated by reference herein without 
disclaimer. 

1. Field of the Invention

The present invention relates generally to the fields of microbiology and immunology.
More particularly, it concerns the response of macrophages to endotoxin exposure from
Gram-negative bacteria and methods of detecting and treating individuals at high risk for
infection by Gram-negative bacteria and inhibiting sepsis and septic shock.

2. Description of Related Art

"Innate" or "natural" immunity is largely subserved by macrophages and NK cells. These
cells lack the huge repertoire of pathogen receptors that comprise the afferent limb of "specific"
immunity, e.g., the receptors found on T-cells and B-cells. Rather, they rely upon very broadly
effective mechanisms for the recognition of invasive organisms. An important case in point
concerns Gram-negative bacteria, all of which bear endotoxin (lipopolysaccharide, LPS)
molecules in their outer membrane. These molecules trigger a strong immune response on the
part of the host, which produces a shock-like syndrome characterized by low blood pressure and
hyporeactivity to vasoconstrictor agents.

When macrophages are exposed to pure preparations of endotoxin, they secrete
numerous cytokine mediators, including tumor necrosis factor (TNF, TNFα), interleukin-1
(IL-1), interferon-α/β, GM-CSF, IL-8 and ultimately smaller "autacoid" molecules, all of which
mediate an intense inflammatory reaction. Endotoxin recognition acts as an early warning signal
through which a host may mount a timely defense against invasion by Gram-negative organisms.
However, widespread activation of macrophages by endotoxin results in the development of
septic shock. By most estimates, Gram-negative septic shock is responsible for 100,000 deaths
per year in the United States alone. The entire syndrome of hypotension, coagulopathy,
pulmonary edema and acute renal failure results, in large part, from the release of TNF and other
cytokines in response to exposure to endotoxin.

TNF is probably the most important mediator of local inflammatory processes as well as
septic endotoxin shock. It is produced quickly and in large amounts by macrophages that
encounter endotoxin (Beutler et al., 1985a). TNF causes shock when administered to animals
(Tracey et al., 1986), and blockade of TNF synthesis or activity markedly attenuates the lethal
effect of endotoxin (Beutler et al., 1985b). Endotoxin exposure and TNF release are
therefore directly correlated. Surprisingly, however, little is known about the mechanism by
which endotoxin triggers host cells to release TNF.

The cognate receptor for endotoxin, complexed with lipopolysaccharide binding protein
(LBP) in plasma, is the GPI-linked cell surface membrane protein CD14. CD14 effectively
concentrates endotoxin onto the surface of macrophages and other defensive cells of the host
(Wright et al., 1990). However, it does not actually signal the presence of endotoxin, as CD14
has no cytoplasmic component with which to do so.

Endotoxin is known to trigger both tyrosine and serine phosphorylation events within the
macrophage cell, and at least in part, ras, raf, MEK, and members of the MAP kinase family are
also involved in signal transduction (Geppert et al., 1994). The endpoints of endotoxin signaling
include activation of the transcription of TNF and various other genes, and activation of the
translation of TNF mRNA (Beutler et al., 1986; Han et al., 1990). At the protein level, this
stimulation by endotoxin leads to a several thousand-fold augmentation of cytokine biosynthesis
by a macrophage cell. But the initial controlling element and event in the signaling pathway of
macrophage response to endotoxin has not been identified. Thus, in spite of its importance,
most of the endotoxin signaling pathway remains relatively unknown. Recently, however, the
Toll-like receptor 2 (TLR2) has been suggested to partially mediate lipopolysaccharide-induced
cellular signaling (Gerard, 1998; Yang et al., 1998).

Thirty years ago, mice of the C3H/HeJ strain were noted to be specifically and globally
unresponsive to endotoxin, while closely related animals of the C3H/HeN or C3H/OuJ
substrains exhibited normal responses (Sultzer, 1968). The median lethal dose of endotoxin is
more than 100-fold higher in C3H/HeJ mice than in either of these other strains. Macrophages
of C3H/HeJ mice fail to produce cytokines in response to endotoxin, and B-lymphocytes of
C3H/HeJ mice are not driven to proliferate by endotoxin. While C3H/HeJ mice are highly
resistant to the lethal effect of endotoxin, they are unusually sensitive to infection by
Gram-negative organisms. The mean lethal inoculum with Salmonella typhimurium, for
example, is two organisms in C3H/HeJ mice, whereas several thousand organisms are required
to kill mice of the C3H/HeN strain. Hence, the ability to sense the presence of endotoxin is
required for defense against Gram-negative organisms, and it is speculated that individuals who
suffer from sepsis and septic shock have a similar genetic mutation which causes them to be
more susceptible to infection.

These defective responses by the C3H/HeJ mice are the result of a single, codominant
mutation, which maps to a position between the widely separated Mup-1 (Major urinary protein)
and Ps (polysyndactyly) loci on mouse chromosome IV (Watson et al., 1978). Mice
homozygous for the mutant allele of the "Lps gene" are unresponsive to endotoxin, whereas
homozygotes for the common allele are normally responsive, whether lethality or cell-based
assays are employed as an index. Heterozygotes exhibit intermediate levels of response. The
protein encoded by this mutant gene is the most important known determinant of
endotoxin-induced TNF biosynthesis, and indeed, of all reactions to endotoxin.

Many attempts have been made to identify the product of the Lps gene, or to clone it.
With the recognition that CD14 serves as the principal cell-surface receptor for endotoxin, it was
proposed that the Lps gene might encode an associated polypeptide chain with signal
transducing potential, or more broadly, an early component of the signal transduction apparatus.
Attempts to identify a CD14 binding molecule, which might be the product of the Lps gene,
have been pursued by several investigators. Two-hybrid screening, affinity chromatography, and
cross-linking approaches have thus far each failed to pinpoint a protein that specifically engages
CD14. Expression cloning strategies have also been applied in the search for the Lps gene
product without success.

Given the occurrence of Gram-negative bacteremia and the high and rising incidence of
Gram-negative nosocomial infections, a certain subset of the population appears to be at high
risk of developing endotoxic shock even if adequate antimicrobial therapy is instituted. It would
be useful to know which patients are at high risk for Gram-negative bacterial infections and
sepsis in advance of onset. Diagnostic methods that predict the risk of infection as well as the
clinical course of sepsis could be reasonably applied to most hospitalized patients. In addition, it
is clear that there is an immediate and increasing need for new drugs and treatment methods that
regulate macrophage response to Gram-negative bacterial infections. Unfortunately, given the
current lack of understanding of the regulation of the macrophage response to endotoxin, these
drugs and methods have not been developed, and patients continue to be at risk for these
life-threatening infections.

SUMMARY OF THE INVENTION

The present invention relates in part to methods for screening for susceptibility to
infection. These methods are based on the Inventors' discovery that the Toll-4 or TLR-4
polypeptide plays a role as the LPS receptor and is, therefore, involved in the pathway leading to
immune responses to certain infections, including especially certain infections
involving Gram-negative bacteria. In particular embodiments, the invention provides screening
methods for identifying individuals at risk for certain infections. Testing positive in such
screens would permit proactive counseling and/or treatment of susceptible individuals. For
example, in particular circumstances an individual may be taking immunosuppressive drugs or
be immunodeficient. It would be advantageous in this, and in other instances, for the individual
susceptible to infection to be apprised of the risk of Gram-negative bacterial infection. Recently
the nomenclature for the Toll-4 protein has been changed to TLR-4 (Toll-like receptor 4). Thus,
in the context of the present invention it is important to note that Toll-4 and TLR-4 are used
interchangeably. The new nomenclature will be used herein, unless such designation leads to
ambiguity in certain textual embodiments.

Such methods for screening for the susceptibility to infection generally comprise:
obtaining sample nucleic acid from an animal; and analyzing the sample nucleic acid to detect a
mutation in a gene encoding a TLR-4 polypeptide relative to a sequence of a gene encoding a
native TLR-4 polypeptide; wherein a mutation in the gene encoding the TLR-4 polypeptide is
indicative of susceptibility to infection. In most cases, the nucleic acid analyzed is DNA, and
the step of analyzing the TLR-4-encoding nucleic acid comprises sequencing the
TLR-4-encoding nucleic acid to obtain a sequence. In order to determine whether a mutation
exists in the obtained sequence of the TLR-4-encoding nucleic acid, the sequence may be
compared to a native nucleic acid sequence of TLR-4. For example, the native nucleic acid
sequence of TLR-4 may have a sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48. In another example, the native TLR-4
polypeptide may have an amino acid sequence set forth in SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:98 or SEQ ID NO:99. Of course, any method of determining whether a
mutation is present in the gene encoding the TLR-4 polypeptide is within the scope of the
invention.
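
The comparison of an obtained sequence with a native sequence can be illustrated by a minimal sketch. It assumes that the two sequences are already aligned and of equal length; the function name `find_point_mutations` and the short toy sequences are hypothetical and are not taken from any SEQ ID NO. Insertions and deletions would require a separate alignment step, which is not shown.

```python
from typing import List, Tuple


def find_point_mutations(native: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, native base, sample base) for each mismatch.

    Assumes the two sequences are pre-aligned and equal in length.
    """
    if len(native) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (i, n, s)
        for i, (n, s) in enumerate(zip(native.upper(), sample.upper()), start=1)
        if n != s
    ]


# Toy sequences for illustration only (hypothetical, not a SEQ ID NO).
native_seq = "ATGCCTGAA"
sample_seq = "ATGCATGAA"
mutations = find_point_mutations(native_seq, sample_seq)
print(mutations)  # [(5, 'C', 'A')]
```

An empty result would indicate that no substitution was detected relative to the chosen reference.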

In cases where the TLR-4-encoding nucleic acid comprises a mutation, that mutation
may be a point mutation, or any other type of mutation. The step of analyzing the
TLR-4-encoding nucleic acid may comprise PCR, an RNase protection assay, or an RFLP procedure.
Alternatively, an antibody that discriminates wild-type TLR-4 from mutant TLR-4 nucleic acid
or TLR-4 polypeptide may be used in an immunodetection format or the TLR-4 polypeptide may
be directly sequenced.

It is contemplated that the mutation in the TLR-4-encoding nucleic acid may arise
through deletion mutations, insertion mutations, frameshift mutations, nonsense mutations,
missense mutations or splice mutations. In particularly preferred embodiments, the sample
nucleic acid comprises a mutation that results in a change from PRO to HIS at residue 712 of a
TLR-4 protein of SEQ ID NO:99. In other preferred embodiments, the sample nucleic acid
comprises a mutation that results in a change from GLU to LYS at residue 178, a change from
ARG to HIS at residue 763, a change from GLN to ARG at residue 188, a change from ASP to
GLY at residue 299, a change from ASN to SER at residue 329, a change from GLU to LYS at
residue 474, a change from ARG to HIS at residue 763, a change from TYR to CYS at residue
46, a change from PRO to HIS at residue 145, a change from CYS to TYR at residue 281, a
change from ASN to HIS at residue 624, or a change from THR to ILE at residue 399 of the
native TLR-4 polypeptide amino acid sequence of SEQ ID NO:98. In yet other embodiments,
the sample nucleic acid comprises at least a second mutation, wherein the second mutation
results in a deletion of VAL-GLY-THR at residues 827-829 of the native TLR-4 polypeptide
amino acid sequence of SEQ ID NO:98.

In particular embodiments of the invention, the sample nucleic acid comprises at least
one point mutation relative to a nucleic acid sequence from a gene encoding a native TLR-4
polypeptide, wherein the mutation is in nucleotide 2342 of the nucleic acid sequence of SEQ ID
NO:46. In further embodiments, the sample nucleic acid comprises at least two point mutations
relative to a nucleic acid sequence from a gene encoding a native TLR-4 polypeptide, wherein at
least one mutation is a change from nucleotide C to nucleotide A at position 2342 of the nucleic
acid sequence of SEQ ID NO:46.
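
As a worked illustration of how a single C-to-A substitution can convert a proline codon into a histidine codon, the sketch below applies the standard genetic code to the four proline codons. It assumes, for illustration only, that the substituted base is the second base of the codon; the text here does not state which codon is affected or link position 2342 to a particular residue, so the helper names and the choice of codons are hypothetical.

```python
# Subset of the standard genetic code relevant to this example.
CODON_TABLE = {
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln",
}


def substitute(codon: str, index: int, base: str) -> str:
    """Return the codon with the base at 0-based `index` replaced by `base`."""
    return codon[:index] + base + codon[index + 1:]


def second_base_c_to_a(codon: str) -> str:
    """Amino acid encoded after a C-to-A change at the second codon position."""
    if codon[1] != "C":
        raise ValueError("second base is not C")
    return CODON_TABLE[substitute(codon, 1, "A")]


# Apply the substitution to each of the four proline codons.
outcomes = {c: second_base_c_to_a(c) for c in ("CCT", "CCC", "CCA", "CCG")}
for codon, residue in outcomes.items():
    print(f"{codon} (Pro) -> {substitute(codon, 1, 'A')} ({residue})")
```

Under these assumptions, only the CCT and CCC codons yield histidine; CCA and CCG would yield glutamine instead.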

In other cases, at least one mutation is in Exon 2, Exon 3 or Intron 2 of the sequence of
SEQ ID NO:47, wherein said at least one mutation is a change from nucleotide A to nucleotide
G at position 8457, a change from nucleotide G to nucleotide A at position 8612, a change from
nucleotide A to nucleotide G at position 8631, a change from nucleotide A to nucleotide G at
position 12245, a change from nucleotide T to nucleotide C at position 12293, a change from
nucleotide C to nucleotide A at position 12412, a change from nucleotide C to nucleotide A at
position 12413, a change from nucleotide A to nucleotide G at position 12541, a change from
nucleotide G to nucleotide A at position 12820, a change from nucleotide A to nucleotide G at
position 12874, a change from nucleotide A to nucleotide G at position 12964, a change from
nucleotide C to nucleotide T at position 13174, a change from nucleotide G to nucleotide A at
position 13398, a change from nucleotide G to nucleotide A at position 13769, a change from
nucleotide A to nucleotide C at position 13848, a change from nucleotide G to nucleotide A at
position 13937, or a change from nucleotide G to nucleotide A at position 14266 of the
sequence of SEQ ID NO:47. In another embodiment, at least one mutation is a deletion of
nucleotide T at position 12228 of the sequence of SEQ ID NO:47. In a preferred embodiment, at
least one mutation is a change from nucleotide A to nucleotide G at position 12245 of the gene
sequence and a deletion of nucleotides 14453 to 14461 of the sequence of SEQ ID NO:47. In
other embodiments, the sample nucleic acid sequence comprises at least two mutations relative
to the sequence of SEQ ID NO:47, wherein the at least two mutations comprise a change from
nucleotide C to nucleotide T at position 12399 and a change from nucleotide G to nucleotide A
at position 12510, a change from nucleotide C to nucleotide A at position 12413 and a change
from nucleotide G to nucleotide A at position 14266, or a change from nucleotide A to
nucleotide G at position 12874 and a change from nucleotide C to nucleotide T at position
13174 of the sequence of SEQ ID NO:47.
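
The paired substitutions listed at the end of the preceding paragraph lend themselves to a simple lookup. The sketch below is illustrative only: the `Variant` tuple layout, the function name `matching_pairs`, and the sample genotype are hypothetical, and the positions are simply those recited above for SEQ ID NO:47.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (position in SEQ ID NO:47, native base, sample base)
Variant = Tuple[int, str, str]

# Paired substitutions recited in the preceding paragraph.
PAIRED_MUTATIONS: Tuple[FrozenSet[Variant], ...] = (
    frozenset({(12399, "C", "T"), (12510, "G", "A")}),
    frozenset({(12413, "C", "A"), (14266, "G", "A")}),
    frozenset({(12874, "A", "G"), (13174, "C", "T")}),
)


def matching_pairs(observed: Iterable[Variant]) -> List[List[Variant]]:
    """Return every recited pair whose two substitutions are both observed."""
    seen = set(observed)
    return [sorted(pair) for pair in PAIRED_MUTATIONS if pair <= seen]


# Hypothetical sample carrying one complete pair and one unpaired change.
sample_variants = [(12874, "A", "G"), (13174, "C", "T"), (12399, "C", "T")]
print(matching_pairs(sample_variants))
```

A sample carrying only one member of a pair, such as the lone change at position 12399 here, is not reported as a match.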


In other embodiments, the present invention relates to methods of reducing susceptibility
of an animal to infection comprising the step of modulating an LPS mediated response in the
animal. In most animals, a mutation or other defect can cause the animal to be unable to mount
an appropriate response in the presence of an infectious agent, for example, a Gram-negative
bacterium. These methods often comprise diagnosing an animal with an infection or one
susceptible to infection via analysis of a TLR-4-encoding nucleic acid sequence for a mutation
relative to a sequence of a gene encoding a native TLR-4 polypeptide, wherein the native TLR-4
polypeptide is a TLR-4 polypeptide that has the sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:98 or SEQ ID NO:99. In some cases, modulating LPS receptor function
comprises providing a TLR-4 polypeptide to the animal. The TLR-4 polypeptide may be a native
TLR-4 polypeptide, for example, one having the sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:98 or SEQ ID NO:99. Alternatively, the TLR-4 polypeptide may be a
modified TLR-4 polypeptide created by molecular biological methods. In some cases, the
provision of a TLR-4 polypeptide is accomplished by inducing expression of a TLR-4
polypeptide. For example, the expression of a TLR-4 polypeptide encoded in the animal's
genome may be induced. Alternatively, the expression of a TLR-4 polypeptide encoded by a
nucleic acid provided to the animal is induced. In other cases, the provision of a TLR-4
polypeptide is accomplished by a method comprising introduction of a TLR-4-encoding nucleic
acid to the animal. In some embodiments, the provision of a TLR-4 polypeptide is accomplished
by injecting a TLR-4 polypeptide into the animal. In yet other embodiments, a TLR-4
polypeptide is provided as a single chain antibody construct and delivered via adenovirus. The
invention also relates to the Inventors' discovery that certain mutants of TLR-4 fail to bind LPS
and that, as a result, a host having such mutations is rendered more susceptible to bacterial
infection.

The invention further contemplates methods of modulating an LPS mediated response
comprising modulating TLR-4 function in an animal. Such methods often comprise the step of
diagnosing the animal via analysis of a TLR-4-encoding nucleic acid sequence for a mutation.
This modulation can be accomplished by providing a TLR-4 polypeptide to the animal in any
manner discussed above. Alternatively, modulating TLR-4 function in the animal comprises
providing a modulator of TLR-4 to the animal.

The process of modulating an LPS mediated response in the animal may comprise
providing a modulator of TLR-4 to the animal. As used herein, a "modulator of TLR-4" is any
substance that affects the functioning of TLR-4 in the LPS pathway. For example, the
modulator of TLR-4 may be an agonist or antagonist of TLR-4. The modulator of TLR-4 may
modulate the transcription and/or translation of a TLR-4-encoding nucleic acid.

In some cases, the methods of reducing susceptibility to an infection involve diagnosing
an animal with susceptibility to infection via analysis of a TLR-4-encoding nucleic acid
sequence for a mutation, in any of the manners discussed above.

Additional aspects of the invention relate to methods of screening for modulators of an
LPS mediated response comprising the steps of: a) obtaining a TLR-4 polypeptide; b)
determining a standard activity profile of the TLR-4 polypeptide; c) contacting the TLR-4
polypeptide with a putative modulator; and d) assaying for a change in the standard activity
profile. In these cases, the TLR-4 polypeptide may have the amino acid sequence of SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID NO:99. The standard activity
profile of the TLR-4 polypeptide is determined by determining the ability of the TLR-4
polypeptide to stimulate transcription of a reporter gene, the reporter gene operatively positioned
under control of a nucleic acid segment comprising a promoter from a TLR-4 gene.
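
Steps b) through d) of the screening method above amount to comparing reporter activity with and without the putative modulator. The following is a minimal sketch under stated assumptions: the two-fold `threshold`, the function names, and the reporter readings are all hypothetical and chosen purely for illustration; they do not represent experimental data or a prescribed assay.

```python
from statistics import mean
from typing import Sequence


def fold_change(baseline: Sequence[float], treated: Sequence[float]) -> float:
    """Ratio of mean treated reporter signal to mean baseline signal."""
    base = mean(baseline)
    if base <= 0:
        raise ValueError("baseline signal must be positive")
    return mean(treated) / base


def classify_modulator(
    baseline: Sequence[float], treated: Sequence[float], threshold: float = 2.0
) -> str:
    """Label a compound by how it shifts reporter activity.

    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    ratio = fold_change(baseline, treated)
    if ratio >= threshold:
        return "candidate agonist"
    if ratio <= 1.0 / threshold:
        return "candidate antagonist"
    return "no clear effect"


# Made-up reporter readings (arbitrary units), not experimental data.
standard_profile = [100.0, 110.0, 90.0]
with_compound = [30.0, 35.0, 25.0]
print(classify_modulator(standard_profile, with_compound))
```

In practice the cut-off and the number of replicates would depend on the variability of the particular reporter assay.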

In other embodiments, a method of modulating an LPS mediated response comprising
modulating TLR-4 function in an animal is provided, often further comprising the step of
diagnosing the animal for susceptibility to infection via analysis of a TLR-4-encoding nucleic
acid sequence for a mutation relative to a sequence of a gene encoding a native TLR-4
polypeptide. In certain embodiments, the animal susceptible to infection is provided a TLR-4
polypeptide, wherein the TLR-4 polypeptide is a TLR-4 polypeptide that has the sequence of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID NO:99. In specific
embodiments, the provision of a TLR-4 polypeptide is accomplished by inducing expression of a
TLR-4 polypeptide in the animal, introduction of a TLR-4-encoding nucleic acid to the animal
or by injecting a TLR-4 polypeptide into the animal. In other embodiments, the step of
modulating TLR-4 function in the animal comprises providing a modulator of TLR-4 to the
animal, wherein the modulator of TLR-4 may be an agonist of TLR-4 or antagonist of TLR-4,
wherein the agonist of TLR-4 or antagonist of TLR-4 modulates transcription or translation of a
TLR-4-encoding nucleic acid.

In some preferred embodiments, the invention contemplates methods of screening for
modulators of an LPS mediated response comprising: a) obtaining a TLR-4-encoding nucleic
acid segment; b) determining a standard transcription and translation activity of the
TLR-4-encoding nucleic acid sequence; c) contacting the TLR-4-encoding nucleic acid segment
with a putative modulator; d) maintaining the nucleic acid segment and putative modulator under
conditions that normally allow for TLR-4 transcription and translation; and e) assaying for a
change in the transcription and translation activity of TLR-4.

Yet other embodiments relate to modulators of an LPS mediated response prepared by a
process comprising screening for modulators of an LPS mediated response comprising: a)
obtaining a TLR-4 polypeptide; b) determining a standard activity profile of the TLR-4
polypeptide; c) contacting the TLR-4 polypeptide with a putative modulator; and d) assaying
for a change in the standard activity profile. Such modulators may be prepared by a process
comprising screening for modulators of an LPS mediated response comprising: a) obtaining a
TLR-4-encoding nucleic acid segment; b) determining a standard transcription and translation
activity of the TLR-4 nucleic acid sequence; c) contacting the TLR-4-encoding nucleic acid
segment with a putative modulator; d) maintaining the nucleic acid segment and putative
modulator under conditions that normally allow for TLR-4 transcription and translation; and e)
assaying for a change in the transcription and translation activity.

The invention further relates to methods of treating Gram-negative bacterial infections
comprising administration of an agent that modulates the recognition of endotoxin through an
LPS mediated response. For example, the agent may stimulate or inhibit the activity of a TLR-4
polypeptide.

Other objects, features and advantages of the present invention will become apparent
from the following detailed description. It should be understood, however, that the detailed
description and the specific examples, while indicating preferred embodiments of the invention,
are given by way of illustration only, since various changes and modifications within the spirit
and scope of the invention will become apparent to those skilled in the art from this detailed
description.


BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings form part of the present specification and are included to further
demonstrate certain aspects of the present invention. The invention may be better understood by
reference to one or more of these drawings in combination with the detailed description of
specific embodiments presented herein.

FIG. 1. Coarse genetic map of the Lps locus with respect to D4MIT markers. 493
meioses from an (SWR X C3H/HeJ) X C3H/HeJ backcross were examined, and crossovers used
to map the locus with respect to eleven markers, including the D4MIT markers shown and a
single marker derived from the interferon-α locus. Numbers above double arrows indicate
centimorgan distances. The gene was confined to a region between a cluster of four inseparable
markers (D4MIT244, 218, 82, and 325) and the single marker D4MIT80.

FIG. 2A and FIG. 2B. Identification of novel microsatellite markers and mapping
of the Lps locus with respect to these markers. FIG. 2A, Map based on 493 meioses analyzed
in the SWR backcross. FIG. 2B, Map based on 1600 meioses analyzed in the C57BL/6
backcross. Scale refers to physical distances (note 1 Mb bar), determined on the basis of
fluorescence in situ hybridization data and pulsed-field gel electrophoresis, performed using
YACs and BACs that span the critical region (FIG. 4 and FIG. 5). Though only one of the
markers (7.3) could be used in both analyses, the maps shown in FIG. 2A and FIG. 2B are
perfectly aligned and drawn to the same scale. Interruptions in map FIG. 2B indicate that
D4Mm64 and D4MIT27 are quite remote from the region of interest. Numerals above
brackets refer to the number of crossovers within each mapping panel that separate individual
polymorphic markers from the Lps^d mutation. D4MIT markers are shown in shaded circles;
novel markers defined in Table I are placed in the correct physical order (centromere is on the
left), and indicated by non-shaded circles. The heavily shaded bar coinciding with the zero
region of each map refers to an area in which no crossover events were observed, between any of
the markers themselves, or between the markers and Lps^d. The location of the Lps^d mutation is
necessarily bounded by markers B and 83.3, as indicated by the heavy bar at the bottom of the
Figure.

FIG. 3A and FIG. 3B. Primary biological assay data used in defining crossovers
between Lps^d and B (FIG. 3A), and between Lps^d and 83.3 (FIG. 3B). Assays of LPS
response were performed on seven separate days. On each of these days, control assays of TNF
production by macrophages (FIG. 3A) or splenocyte proliferation (FIG. 3B) from obligate
heterozygotes and obligate homozygotes for the Lps^d allele were performed in parallel with
assays performed on samples from the seven mice known to have recombination in the interval
between B and 83.3. Results of assays performed on obligate heterozygotes are shown as blue
circles; obligate homozygote results are shown as red circles. The LPS responses of the seven
mice with a recombination event between markers B and 83.3 are represented as black squares.
Four of these mice (FIG. 3A) showed recombination between B and Lps^d, and three mice
(FIG. 3B) showed recombination between Lps^d and 83.3. No overlap was apparent
between the range of values obtained with obligate homozygotes and obligate heterozygotes, and
the mice bearing recombination could be unambiguously categorized as responders or
nonresponders. Each point represents the mean of four replicate cytotoxicity assays (FIG. 3A) or
splenocyte proliferation assays (FIG. 3B). Error bars are omitted for simplicity, but were
generally in the range of 10% of the mean value.

FIG. 4A and FIG. 4B. FIG. 4A: Centromere is to the left. Sixty-three BACs are
included in this contig. A gap of small size (probably less than 100 kb) separates BACs 24 and
25. The remainder of the contig is seamless. The apparent gap between BACs 58 and 59 is
covered by YAC clone 100E4. Vertical lines indicate the ends of selected BACs contacting
SP6 or T7 primer sites within the BAC cloning vector. Tick marks indicate specific unique
markers listed above the contig. Microsatellite markers are given alphabetical or numerical
designations. D4MIT markers (325, 25, 178, 7, 132, and 83) are shown in large type. Circled
microsatellite markers were polymorphic with respect to C3H/HeJ and SWR (magenta) or
C3H/HeJ and C57BL/6 (orange). Double arrows at the top of the Figure indicate physical
distances estimated by interphase FISH analysis (numbers indicate megabases; margin of error =
±10%). Vertical lines separating arrows point to the center of the BACs used for FISH distance
measurements (4L22, 297O12, 259N13, 346B6, 217B22, 215K4, and 293L15). All BACs are
drawn to scale, based on sizes obtained through pulsed-field gel electrophoresis.

FIG. 4B. Bars indicate genetic distances assigned by means of two independent
backcrosses. A backcross involving SWR mice yielded 493 meaningful meioses. Three
crossovers were observed between marker 83.3 and marker 7.11, corresponding to a genetic
distance of 0.6 cM, and one crossover was observed between marker D4MIT325 and marker
25.5, corresponding to a genetic distance of 0.2 cM (magenta bars). A backcross involving
C57BL/6 mice yielded 1600 meaningful meioses. Four crossovers were observed between
marker B and marker A, corresponding to a genetic distance of 0.25 cM, and 13 crossovers were
observed between marker C and marker B, corresponding to a genetic distance of 0.8 cM
(orange bar). On 2093 meioses, the mutation is therefore confined to an interval between
markers B and 83.3. This corresponds to a physical distance of 2.6 Mb (denoted by the blue
bar). Complete absence of crossovers (the zero region) was observed in the composite of the
two crosses over a 1.2 Mb interval extending from marker A through marker 7.11 (denoted by the
black bar). No polymorphic markers capable of distinguishing C57BL/6 from C3H/HeJ were
identified distal to marker 7.11 in the critical region. The BACs represented in this Figure are
as follows:

(1. 4L22 2. 329E1 3. 331E22 4. 18J9 5. 259B3 6. 147P7 7. 179M4 8. 297O12 9.
363L11 10. 353J12 11. 151O8 12. 312J8 13. 358P4 14. 327O21 15. 297N10 16. 92G10 17.
259N13 18. 243O20 19. 216C14 20. 131M6 21. 49K20 22. 135O17 23. 274K20 24. 336A11
25. 309I17 26. 152C16 27. 352P10 28. 58H7 29. 84C8 30. 346B6 31. 373I18 32. 288K23 33.
291G16 34. 276O8 35. 340I16 36. 269E13 37. 62A9 38. 389F15 39. 353O21 40. 197M3 41.
293J8 42. 220E13 43. 181N19 44. 369F7 45. 430N20 46. 370J14 47. 213O15 48. 265H22 49.
20B5 50. 175I18 51. 247P7 52. 264N15 53. 204O1 54. 217B22 55. 178D24 56. 300H9 57.
188A22 58. 289J11 59. 152B3 60. 288O20 61. 216K4 62. 293L15 63. 147M3)

All BAC designations refer to the Research Genetics mouse BAC library, with the
exception of BACs 84C8 and 389F15, which were obtained from Genome Systems (mouse
C57BL/6 BAC library).
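
The genetic distances quoted in the legend to FIG. 4B follow from the ratio of crossovers to meioses. The sketch below reproduces that arithmetic under the usual short-interval approximation in which one centimorgan corresponds to one percent recombination; the function name and the interval labels are illustrative, while the crossover and meiosis counts are those stated in the legend.

```python
def genetic_distance_cm(crossovers: int, meioses: int) -> float:
    """Recombination frequency expressed in centimorgans (percent recombinants)."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    return 100.0 * crossovers / meioses


# (interval, crossovers, meioses) as stated in the FIG. 4B legend.
intervals = [
    ("83.3 to 7.11 (SWR backcross)", 3, 493),
    ("D4MIT325 to 25.5 (SWR backcross)", 1, 493),
    ("B to A (C57BL/6 backcross)", 4, 1600),
    ("C to B (C57BL/6 backcross)", 13, 1600),
]
for name, crossovers, meioses in intervals:
    print(f"{name}: {genetic_distance_cm(crossovers, meioses):.2f} cM")
```

Rounded to one decimal place, the first, second and fourth intervals give the 0.6, 0.2 and 0.8 cM values in the legend, and the third gives 0.25 cM exactly; the two panels together account for the 2093 meioses (493 + 1600).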

FIG. 5. Minimal contig of BACs and a single YAC, sequenced in the search for Lps.
BACs sequenced to completion or to the point of finishing are shown in blue ("complete").
BACs sequenced to a high density, but not yet at the point of finishing, are shown in red ("in
progress"). YAC clone 100E4 has also been partially sequenced. BACs 293L15 and 147M3
have not been sequenced. Bar at the top of the figure refers to genetic distances from limiting
markers B and 83.3 to Lps^d. The zero recombination area is shown in black. Several of the
polymorphic markers used in mapping (circled) are included as landmarks. The positions of the
pseudogenes detected are shown at BAC-level resolution; sequences from TLR-4, considered
the prime candidate gene, were detected in BAC 309I17 and in BAC 152C16.

FIG. 6. TLR-4 receptor locus and a portion of the Pappa locus. The two genes were
identified by GRAIL and BLAST analysis. The orientation and location of TLR-4 and Pappa
are shown with respect to the nearest genetic markers.


FIG. 7A. Amino acid sequences of mouse mutant J-Toll-4, mouse N-Toll-4, rat
TLR-4 and human TLR-4. The mutant mouse J-Toll-4 TLR-4 amino acid sequence contains a
point mutation at residue 712 (proline to histidine), not found in the amino acid sequences of
N-Toll-4, rat TLR-4 or human TLR-4. The numbering system in this figure does not take into
account the spacing introduced to maximize the sequence alignment.

FIG. 7B. Sequencing of amplified DNA from the C3H/HeJ, C3H/HeN, SWR, 
C57BL/6, and DBA-2 mice genomes. 

FIG. 8A. Amplification of low abundance control cDNAs from HeN, HeJ, ScSn, 
ScCr, HeN, HeJ, ScSn and ScCr. 

FIG. 8B. TLR-4 mRNA detected by Northern blots using total RNA from
macrophages.

FIG. 9. TLR-4 mRNA induced by LPS in lymphoid tissues. 

FIG. 10. Spatially conserved representation of coding mutations found in TLR4 of
102 human subjects. Each lightly shaded dot refers to a mutation affecting a residue that is
relatively conserved (either invariant among six mammalian species, or extant in two forms).
Each black dot refers to a less conserved residue (three or more variants among species).
Twelve individuals were found to be heterozygous for a double amino acid substitution. One
individual was found to be heterozygous at only one of the mutant sites. Introns are shortened,
and non-coding regions are not shown, but the coding region of the three principal human exons
is drawn to an equivalent scale at all points in the illustration.

FIG. 11. Spatially conserved representation of coding mutations found in Tlr4 of 35
Mus musculus strains. As in FIG. 10, each lightly shaded dot refers to a mutation affecting a
residue that is relatively conserved, and each black dot refers to a mutation affecting a less
conserved residue.

FIG. 12. Genetic distance and probable ancestral relationships among Tlr4 genes of
35 Mus musculus strains. Numbers within circles denote strains, in accordance with the legend
of Table IV. Numbers within parentheses indicate the mutational distance (number of mutations
separating each strain from its presumed ancestor), with reference to both coding and non-coding
substitutions listed in Table IV. Arrows point in the direction of descent, and in length
are proportionate to distance. Dashed arrows suggest that mice of a given genotype evidently
contributed genetic information to mice of another strain, given the similarity of the mutations
observed, though unique mutations are also observed in the latter strain, and not all of the
mutations observed in the former strain are present. The symbol denotes the likelihood of
an intermediate form prior to interbreeding of strains.

FIG. 13. Spline curve illustrating interspecific sequence variation across the Tlr4
protein. A multiple alignment of Tlr4 sequences from three rodent species (mouse, rat, and
hamster) and three primate species (human, chimpanzee, and baboon) was generated using the
GCG program pileup. The number of amino acids observed at each residue was plotted using
the program Prism 3.0 (a value of 1 was assigned if a single amino acid was observed among the
six species; a value of 5 was assigned if five forms were observed among the six species; etc.).
The points were then connected using a cubic spline curve. Ecto, the extracellular domain; TM,
the transmembrane domain; Prox, the proximal cytoplasmic domain; Dist, the distal cytoplasmic
domain. Numbering refers to the human sequence. Where a deletion was introduced by pileup,
a single mismatch was assumed. Where the sequence was truncated, each absent residue was
tabulated as a separate mismatch.
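
The per-residue tabulation described for FIG. 13 can be illustrated with a short sketch. The function name, the toy alignment and the gap handling below are assumptions made for illustration; in particular, charging one extra form to any column that contains a gap is only a simplified reading of the rule that a deletion is scored as a single mismatch, and the cubic spline step is not reproduced.

```python
def residue_variability(aligned, gap="-"):
    """Count the distinct amino acids seen at each column of an alignment.

    Assumption: a column containing at least one gap is charged one extra
    form, a simplified stand-in for scoring a deletion as one mismatch.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        residues = {aa for aa in column if aa != gap}
        extra = 1 if gap in column else 0
        scores.append(len(residues) + extra)
    return scores


# Hypothetical six-sequence alignment (not real Tlr4 sequence).
toy_alignment = ["MKLP", "MKLP", "MRLP", "MRL-", "MQIP", "MKIP"]
print(residue_variability(toy_alignment))  # [1, 3, 2, 2]
```

A value of 1 marks a position that is invariant among the six sequences, matching the scoring convention stated in the legend.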

FIG. 14. Schematic illustration of recombinant proteins expressed in RAW 264.7
cells. Constructs were made by PCR, using cDNA derived from C3H/HeJ and C3H/HeN mice.
The primers (5'→3'):

ATC GAT AGG AGO AGG GTT GAA TGG G 
and 

TAT GGA TAG GAG GAA GGT TGA ATG GG 
were used to generate the full-length amplified products, which were cloned into the
vector pFLAG-CMV-1 (Sigma) using ClaI and KpnI sites. The native signal peptide was thus
removed, and an alternative signal peptide, followed by the FLAG sequence, was provided by the
vector. The ectodomain construct was produced using the downstream primer (5'→3'):
GAG GGT AGG TGA GAG GTG AAA ATA GAA GTG GTA T, 
whereas the two cytoplasmic domain constructs were produced using the upstream
primer (5'→3'):

GGG GAA TTG AAT GTA GAA GAG AAT GAT GAG T. 

The latter two constructs were cloned into pFLAG-CMV-1 using EcoRI and KpnI sites. All
constructs were verified by DNA sequencing on both strands. All expression constructs were
shown to yield products of anticipated size in COS cells, after Western blot detection with M2
monoclonal antibody (not shown).

FIG. 15A. Saturation isotherm of monoclonal antibody M2 binding to a single
clone of RAW 264.7 cells transfected with an expression vector encoding Tlr4^^^ *^. M2
antibody was labeled to a specific activity of 3.0 x 10 cpm/µg using 125I, by means of the
iodogen technique^^. Labeled antibody (concentration range 50 ng/ml to 20 µg/ml) was added
in a volume of 1.0 ml to monolayers of 2.0 x 10^ cells in Hank's balanced salt solution,
supplemented with 10% fetal bovine serum and buffered with 50 mM HEPES, pH 7.4. The cells
were maintained at 0 °C for a period of 4 hours. Performed as such, 56.4% saturation of the
surface receptor was achieved at equilibrium using the highest concentration of antibody.
Nonlinear regression analysis, based on the assumption of hyperbolic binding kinetics, suggested
a Kd of 1.0 x 10"^ ± 3.0 x 10'^ M"^ and the presence of 2.81 x 10"^ ± 4.99 x 10^ binding sites per
cell. R^2 for the analysis was 0.9576. Controls, performed for all plates, included the addition of
the FLAG peptide to the system at a 10 µM concentration in order to block specific binding, and
only specific binding (in general, approximately 80% of total binding) is presented here.
Mathematical analysis was performed using the program Prism 3.0 (GraphPad Software
Inc).
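
As a rough consistency check on the hyperbolic (one-site) binding model named in this legend, the fractional saturation f = X / (Kd + X) can be rearranged to estimate Kd from a single point. The sketch below uses the two values stated above (56.4% saturation at 20 µg/ml); the function name is hypothetical, and the antibody molecular mass of about 150,000 g/mol is an assumption that does not appear in the text. It is not the nonlinear regression that was performed with Prism.

```python
def kd_from_fractional_saturation(ligand_conc, fraction_bound):
    """Solve f = X / (Kd + X) for Kd, in the units of ligand_conc."""
    if not 0.0 < fraction_bound < 1.0:
        raise ValueError("fraction_bound must lie strictly between 0 and 1")
    return ligand_conc * (1.0 - fraction_bound) / fraction_bound


# Values from the legend: 56.4% saturation at the highest dose, 20 µg/ml.
kd_ug_per_ml = kd_from_fractional_saturation(20.0, 0.564)

# Assumption (not in the text): IgG molecular mass of about 150,000 g/mol.
IGG_G_PER_MOL = 150_000.0
kd_molar = kd_ug_per_ml * 1e-6 * 1e3 / IGG_G_PER_MOL  # µg/ml -> g/L -> mol/L

print(f"Kd ~ {kd_ug_per_ml:.1f} µg/ml ~ {kd_molar:.2e} M")
```

Under that assumed molecular mass, the single-point estimate is on the order of 10^-7 M.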

FIG. 15B. The expression level of each construct (copy number per cell) was
determined by direct measurement of equilibrium M2 monoclonal antibody binding at a fixed
concentration, with reference to the nonlinear regression analysis of saturation isotherms
presented in FIG. 15A. Each point represents the result of duplicate determinations of specific
binding. Cells transfected with vector alone had zero specific binding (not shown).

FIG. 15C. Composite analysis of the shift in EC50 caused by expression of
Tlr4^Lps-n or Tlr4^Lps-d in RAW 264.7 cells. Each point represents an EC50 determination
performed on ten independent clones at the stated LPS concentration. For each clone, four
replicate assays were performed at each concentration. Hence, each curve represents the
composite analysis of 240 assays. Transfected macrophages were plated in 24 well plates at a
density of 5 x 10^ cells per well, and covered with 1.0 ml of DMEM supplemented with 10%
FBS. LPS was added to each well at the concentration indicated. After 15 hours of incubation
the medium was harvested, and TNF concentration was assayed in the standard L-929
cytotoxicity system^^, using cycloheximide at a concentration of 100 µg/ml to potentiate killing.
After 15 hours of exposure to diluted macrophage medium or to mouse TNF applied at a range
of 8 standard concentrations, the number of viable cells was determined by staining with crystal
violet. A standard curve relating % cytotoxicity to TNF concentration (not shown) was
generated using Prism 3.0, and was based on an assumption of sigmoidicity. Variance among
replicate samples was typically beneath 5% of the mean, and departures of individual samples
from the sigmoid plot were typically smaller still. Linear estimates of TNF concentration in
unknown samples were based on non-linear interpolation from the standard curve (also
performed using Prism 3.0). The curves shown were generated assuming sigmoidicity of
response with variable slope for each curve, and further assigning a maximum response value of
1760 pg TNF, which yielded an optimal fit for the most responsive curve. Error bars indicate
standard deviation among clones at each LPS concentration. Log EC50 values and standard
error values are shown in the inset table.
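
The sigmoidal, variable-slope response curve and the interpolation from a standard curve mentioned in this legend can be sketched with a common four-parameter logistic form. Whether Prism 3.0 used exactly this parameterization is an assumption, the function and parameter names are hypothetical, and the log EC50 and slope values in the usage lines are invented; only the maximum response of 1760 pg TNF comes from the legend.

```python
import math


def sigmoid_response(log_dose, bottom, top, log_ec50, hill_slope):
    """Variable-slope sigmoidal dose-response on a log10 dose axis."""
    exponent = (log_ec50 - log_dose) * hill_slope
    return bottom + (top - bottom) / (1.0 + 10.0 ** exponent)


def interpolate_log_dose(response, bottom, top, log_ec50, hill_slope):
    """Invert the curve: the log10 dose that produces a given response."""
    if not min(bottom, top) < response < max(bottom, top):
        raise ValueError("response lies outside the range of the curve")
    ratio = (top - bottom) / (response - bottom) - 1.0
    return log_ec50 - math.log10(ratio) / hill_slope


# Hypothetical curve: maximum fixed at 1760 pg TNF (from the legend);
# log EC50 of -9.0 and slope of 1.0 are illustrative values only.
half_max = sigmoid_response(-9.0, 0.0, 1760.0, -9.0, 1.0)
recovered = interpolate_log_dose(half_max, 0.0, 1760.0, -9.0, 1.0)
print(half_max, recovered)
```

By construction the response at the log EC50 is half of the maximum, and inverting the curve at that response returns the log EC50.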

FIG. 15D. Shifts of the log EC50, determined for individual clones. In a more
conservative approach, log EC50 values were determined for individual clones
transfected with vector alone, or with the Tlr4^Lps-n or Tlr4^Lps-d constructs. The log EC50 data
were then represented as a scatter plot, in which each point represents the log EC50
determination of a single clone (confidence limits not shown for the sake of simplicity), and is
derived from 24 separate TNF assays performed on samples stimulated over a 10^4-fold range of
LPS concentration (100 pg/ml to 1 µg/ml, as well as an unstimulated control). In most
instances, two determinations of log EC50 were made independently for each clone. The mean
and standard error of each cluster are shown in the inset, together with p values defining the
likelihood that the EC50 displacements are attributable to chance. Data were analyzed by means
of a one-tailed t test, using Welch's correction for unequal variance.
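
For readers unfamiliar with Welch's correction, the following sketch computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups of log EC50 values. The function name and the sample values are hypothetical, and the conversion to a one-tailed p value (which requires the t distribution) is deliberately left out.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for Welch's unequal-variance t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical log EC50 values for two sets of clones (not data from the figure).
vector_only = [-9.1, -9.0, -9.2, -8.9, -9.0]
mutant = [-8.6, -8.2, -8.8, -8.0, -8.5]
t_stat, dof = welch_t(vector_only, mutant)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Unlike the pooled-variance t test, the degrees of freedom here are generally non-integer and shrink as the two variances diverge.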

FIG. 15E. Lack of correlation between level of recombinant receptor expression 
and the magnitude of effect on EC50. Plotting the EC50 of Tlr4^P''" transfected cells (nine 
clones; duplicate assays) and Tlr4^'"^ transfected cells (eight clones; duplicate assays) vs. the 
receptor number measured for each clone (shown separately in FIGS. 15B and 15D), it is apparent
that no correlation exists over the range of receptor number surveyed. Inset: p values calculated 
to assess the significance of departure of the slope (determined by linear regression for each set 
of points) from zero. 

FIG. 16A. Over-expression of the Tlr4 ectodomain fails to inhibit LPS signaling.
Composite analysis of the shift in EC50 caused by expression of the Tlr4 ectodomain (no
cytoplasmic domain) or the full-length Tlr4^Lps-d protein in RAW 264.7 cells. Control cells were
transfected with the empty vector. Analysis was carried out in a manner identical to that
described for FIG. 15C. Inset: log EC50 for each curve, and standard error.

FIG. 16B. Distribution of individual log EC50 data for clones expressing the Tlr4
ectodomain, the full-length Tlr4^Lps-d protein, or no recombinant receptor (transfected
with empty vector). Inset: the mean and standard error of each cluster, together with p values
defining the likelihood that the EC50 displacements are attributable to chance. Data were
analyzed by means of a t test, using Welch's correction for unequal variance.

FIG. 17A. Over-expression of the Tlr4 cytoplasmic domain, either with or without
the Lps-d mutation, impairs signal transduction in RAW 264.7 cells. Composite analysis of
the shift in EC50 caused by expression of the Tlr4 ectodomain (no cytoplasmic domain) or the
full-length Tlr4^Lps-d protein in RAW 264.7 cells. Control cells were transfected with the empty
vector. Analysis was carried out in a manner identical to that described for FIG. 15C. Inset: log
EC50 for each curve, and standard error.

FIG. 17B. Distribution of individual log EC50 data for clones expressing the
cytoplasmic domain of the Tlr4^Lps-n protein, the Tlr4^Lps-d protein, or no recombinant
protein (transfected with empty vector). Inset: the mean and standard error of each cluster,
together with p values defining the likelihood that the EC50 displacements are attributable to
chance. Data were analyzed by means of a t test, using Welch's correction for unequal variance.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS



Bacterial infections represent a significant challenge in the treatment of a wide variety
of diseases. Structurally disparate LPS molecules, produced by many different species of
Gram-negative bacteria, are engaged on the macrophage surface by CD14 and ultimately trigger the
release of cytokines such as tumor necrosis factor (TNF) and interleukin-1 (IL-1). These
cytokines orchestrate the inflammatory response, with its attendant beneficial and harmful
effects.

Given the ubiquitous presence of Gram-negative bacteria and the high incidence of
Gram-negative mediated infections, certain individuals are at high risk of developing endotoxic
shock even if adequate antimicrobial therapy is instituted. Methods for diagnosing patients at
high risk for Gram-negative bacterial infections and sepsis in advance of their onset would be
beneficial. Further, there is a need to determine whether a particular individual may be
susceptible to Gram-negative bacterial infection when determining the course of treatment for any
number of diseases. For example, it will be beneficial to test individuals who are candidates for
immunosuppressive drug therapy for susceptibility to Gram-negative bacteria in order to assess
the risks of immunosuppressive treatment. Such diagnostic methods to predict the risk of
infection as well as the clinical course of sepsis could be reasonably applied to most hospitalized
patients. The present invention is directed towards addressing these needs.

The claimed invention also provides methods of identifying agents which stimulate
endotoxin signaling such that TNF and other cytokines are released from macrophages. It is
envisioned that such agents will be of therapeutic use in the treatment of individuals who are
insensitive to endotoxin, either through genetic defect, disease or other condition, and whose
immune system requires external stimulation to recognize infection from Gram-negative
bacteria.

A. The Present Invention

In broad aspects, the present invention provides methods for screening for susceptibility
to infection. On the basis of 2093 meioses analyzed in two separate intraspecific backcrosses,
the location of the mouse Lps^d mutation has been circumscribed to a genetic interval 0.9 cM in
size. To identify gene candidates, nearly 40,000 sequencing runs were performed across the
critical region. Selective hybridization and exon trapping were also employed to identify genes
throughout the "zero" region. These studies identified only a single intact gene
within the entire critical region. This gene encodes the TLR-4 receptor, a member of
the IL-1 family of receptors. Thus, the present inventors demonstrate that there is a mutation in
the TLR-4-encoding gene that appears to provide a predisposition to infection.

In these studies a total of 19 genetic markers that lie in close proximity to the mutation
were examined in mapping. Most of these were previously unpublished polymorphic
microsatellites, identified by fragmentation of YAC and BAC clones spanning the region of
interest. Lps^d was found to be inseparable from the microsatellite marker D4MIT178, and from
three novel polymorphic microsatellites identified near D4MIT178. The mutation was confined
between two novel microsatellite markers, herein designated "B" and "83.3." B lies centromeric
to the mutation, and was separated by four crossovers in a panel of 1600 mice; 83.3 lies distal to
the mutation and was separated by three crossovers in a panel of 493 mice. Sixty-six BAC clones and
one YAC clone were assembled to cover >95% of the critical region. Estimates based on
pulsed field gel electrophoresis and fluorescence in situ hybridization indicate that the B→83.3
interval is about 3.2 Mb in length. A minimal area of zero recombinational distance from Lps^d
was also assigned, and found to occupy approximately 1.2 Mb of physical size.
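
The 0.9 cM figure quoted for the critical interval follows directly from the crossover counts given above. A minimal arithmetic sketch (the function name is hypothetical, and the simple ratio ignores any mapping-function correction, which is negligible at distances this small):

```python
def centimorgans(crossovers, meioses):
    """Genetic distance as percent recombination (no mapping function)."""
    return 100.0 * crossovers / meioses


proximal = centimorgans(4, 1600)  # marker B to the mutation
distal = centimorgans(3, 493)     # mutation to marker 83.3
interval = proximal + distal

print(f"B side: {proximal:.2f} cM, 83.3 side: {distal:.2f} cM, "
      f"total: {interval:.2f} cM across {1600 + 493} meioses")
```

The two panels together account for the 2093 meioses, and the summed distance rounds to 0.9 cM.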

It was found that in the macrophages of mice that are susceptible to bacterial infection
there is a genetic mutation in the Lps locus. Specifically, there is a mutation in the TLR-4
receptor that is expressed by the macrophages of these compromised mice and this mutation
leads to a reduced recognition of endotoxin. As the endotoxin is not recognized by these
defense cells, there is a lack of immune response mounted against the invading bacteria which
results in the deleterious effect of the infection. The present invention suggests that similar
mechanisms work in other mammalian cells and as such in a broad sense the present invention
provides methods of preventing a bacterial infection of a host comprising ensuring that the
macrophages of the host express a functional TLR-4 or Toll-like receptor. By providing such a
functional receptor, the present invention ensures that the endotoxin signal is recognized by the
immune system of the host. Conversely, in those instances in which widespread activation of
macrophages by endotoxin results in the overproduction of TNF leading to the development of
septic shock, it may be desirable to down-regulate the TLR-4 receptor.

Thus, the present invention also provides a method of treating Gram-negative bacterial
infections comprising administration of an agent that stimulates the recognition of endotoxin. In
specific circumstances such an agent is likely to be a candidate substance that stimulates the
expression, activity or function of the TLR-4 receptor that is expressed by the macrophages of
the host. Similarly, methods and compositions for treating endotoxin related symptoms
comprising administration of an agent that reduces the recognition of endotoxin also are
provided by the present invention. A useful compound that may be identified by the present
invention is one which inhibits endotoxin signaling by binding to a TLR-4 receptor and
competing with endotoxin for the binding. Particularly preferred agents would be those that
modulate the stimulation of TNF and cytokine secretion. Such a modulation may be an increase
in secretion in circumstances where the endotoxin has not been recognized or a decrease in
secretion in circumstances where there has been a deleterious production of TNF and/or other
cytokines.

Thus, as outlined above and described in detail herein below, the TLR-4 sequence will
find utility in a variety of applications in bacterial infection susceptibility detection, diagnosis,
prognosis and treatment. Examples of such applications within the scope of the present invention
include amplification of markers of LPS mediated infections using specific primers; detection of
markers of TLR-4 by hybridization with oligonucleotide probes; incorporation of isolated nucleic
acids into vectors and expression of vector-incorporated nucleic acids as RNA and protein;
development of immunologic reagents corresponding to gene encoded products; and therapeutic
treatment for the identified infection using these reagents as well as anti-sense nucleic acids or
other inhibitors specific for the identified disease. The present invention further discloses
screening assays for compounds to upregulate gene expression or to combat the effects of the
mutant TLR-4 genes.

B. LPS Mutation is Responsible for Susceptibility to Bacterial Infection 

The Lps locus is an important susceptibility locus, influencing the propensity to develop a
disseminated Gram-negative infection, or the outcome of such an infection. Hence, C3H/HeJ
mice, while highly resistant to LPS, show exaggerated susceptibility to infection by Gram-negative
organisms (O'Brien et al., 1980; Macela et al., 1996). In birds, resistance to Salmonella
typhimurium is linked to a polymorphism at the tenascin locus (Hu et al., 1997); tenascin is
closely linked to the Lps gene, and it may be assumed that a mutation of the avian Lps locus lies
in linkage disequilibrium with the tenascin marker, yielding the reported association. From
these investigations it is likely that mutations at the equivalent locus in humans also will
influence the course of Gram-negative infection.

The Lps critical region is remarkably gene-poor. While the average megabase of
mammalian DNA contains approximately 30 genes, only one authentic gene (and a portion of a
second gene) has been detected within 2.6 Mb of DNA flanking Lps. As virtually all of the Lps
critical region was sequenced herein, and no other plausible candidates were found, it must be
considered that the lone candidate, encoding the toll-like receptor 4 (TLR-4; Tlr-4; TIL4) of
mice, very likely represents the Lps locus. Moreover, the inventors suggest that a specific
mutation of this gene is responsible for the endotoxin-unresponsive phenotype witnessed in
C3H/HeJ mice.

The toll family of receptors (Chaudhary et al., 1998; Rock et al., 1998) is defined by
homology to the Drosophila toll protein, a plasma membrane receptor which engages an
extracellular mediator encoded by spätzle, leading to activation of a rel gene family member, by
inducing its dissociation from cactus. This sequence of events is important for induction of the
drosomycin antifungal response in Drosophila (Rosetto et al., 1995; Lemaitre et al., 1996). The
mammalian IL-1 receptor is a member of the toll family of proteins, and four other mammalian
family members (Toll-like receptors 1 through 4) have been identified by molecular cloning,
though their function is uncertain. IL-1 signaling involves, among other events, the activation of
NF-κB, which, like dorsal, is a member of the rel family. LPS signaling also entails activation
of NF-κB. As such, it is plausible to consider that LPS signaling might involve transduction via
a toll family member. The present invention, for the first time, shows that TLR-4 is the receptor
for LPS.

Further evidence consistent with this hypothesis may be seen in the clinical observation
of Kuhns and coworkers, who determined that a profound immunodeficiency results from a
conjoint defect in responses to IL-1 and LPS (Kuhns et al., 1997). In view of the inventors'
findings, the fact that a single mutation may block signal transduction initiated by both IL-1 and
LPS may be taken to indicate that a common (proximal) mediator serves both the IL-1 receptor
and the LPS receptor. This, in turn, would suggest the existence of structural similarity between
the IL-1 and LPS receptors. While the details of signal transduction via toll family members
have not been fully elucidated, the involvement of MyD88, IRAK, and TRAF6 has recently been
proposed in the case of the TLR-4 receptor (Muzio et al., 1998).

The Lps^d mutation has a codominant character, and attempts to identify the product of
Lps through expression cDNA cloning in C3H/HeJ macrophages were unsuccessful. Blockade
of endotoxin signal transduction in the C3H/HeJ mice may therefore reflect the expression of a
protein with dominant negative characteristics. As CD14 serves as the physical receptor for LPS
on the cell surface yet lacks a transmembrane domain, it would seem likely that CD14 engages
TLR-4, and that the latter protein acts to transduce the LPS signal across the membrane. It is
possible that the Lps^d mutation leads to unproductive interaction between CD14 and TLR-4,
preventing signal transduction through other components of the signaling pathway.
Alternatively, the mutation may merely abolish signal transduction through TLR-4 itself.
Examination of these hypotheses will depend upon the demonstration of a mutational difference
between the TLR-4 gene in C3H/HeJ mice and in C3H/HeN animals.

C. TLR-4 Polypeptides

TLR-4 may be obtained according to various standard methodologies that are known to
those of skill in the art. For example, antibodies specific for TLR-4 may be used in
immunoaffinity protocols to isolate TLR-4 from cells. Antibodies are advantageously bound to
supports, such as columns or beads, and the immobilized antibodies can be used to pull the
TLR-4 target out of the cell lysate. Size fractionation (chromatography, centrifugation), ion
exchange or affinity chromatography, and even gel purification may be used for purification as
well.

TLR-4, according to the present invention, may advantageously be cleaved into fragments
for use in further structural or functional analysis, or in the generation of reagents such as
TLR-4-related polypeptides and TLR-4-specific antibodies. This can be accomplished by treating purified
or unpurified TLR-4 with a peptidase such as endoproteinase glu-C (Boehringer, Indianapolis, IN).
Treatment with CNBr is another method by which TLR-4 fragments may be produced from natural
TLR-4. Recombinant techniques also can be used to produce specific fragments of TLR-4.

In addition to the peptidyl compounds described herein, the inventors also contemplate that
other sterically similar compounds, called peptidomimetics, may be formulated to mimic the key
portions of the peptide structure. Mimetics are peptide-containing molecules which mimic
elements of protein secondary structure. See, for example, Johnson et al. (1993). The underlying
rationale behind the use of peptide mimetics is that the peptide backbone of proteins exists chiefly
to orient amino acid side chains in such a way as to facilitate molecular interactions, such as those
of receptor and ligand.

Successful applications of the peptide mimetic concept have thus far focused on mimetics
of β-turns within proteins. Likely β-turn structures within TLR-4 can be predicted by
computer-based algorithms as discussed above. Once the component amino acids of the turn are determined,
mimetics can be constructed to achieve a similar spatial orientation of the essential elements of the
amino acid side chains, as discussed in Johnson et al. (1993).

D. DNA Segments

Important aspects of the present invention concern isolated DNA segments and
recombinant vectors encoding TLR-4, and the creation and use of recombinant host cells
through the application of DNA technology, that express TLR-4 for the purposes of increasing
the recognition of endotoxin by a host cell. The present invention shows that the long
sought-after gene at the Lps locus is TLR-4. TLR-4 is a receptor for LPS and is thus the protein that
binds to and recognizes LPS from the bacterial cell wall. It is recognition of the endotoxin by
TLR-4 that allows an animal to mount an immune response against the invading bacteria. If the
TLR-4 peptide is mutated such that it is unable to recognize the LPS, the animal will not be able
to mount an immune response.

TLR-4 is a member of the IL-1 family of receptors. U. S. Patent 5,786,331; U. S. Patent
5,776,731; U. S. Patent 5,767,234; U. S. Patent 5,767,064; U. S. Patent 5,726,148; U. S. Patent
5,608,035; U. S. Patent 5,508,262; U. S. Patent 5,488,032; U. S. Patent 5,464,937, each
specifically incorporated herein by reference, describe the IL-1 receptor and methods and
compositions related to modulating the activity thereof. The present invention contemplates
using techniques and compositions similar to those described in these patents for use with the
TLR-4 receptor of the present invention.

DNA segments, recombinant vectors, recombinant host cells and expression methods
using sequences of the human TLR-4 (SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:47), rat
(SEQ ID NO:5) and mouse (SEQ ID NO:45, SEQ ID NO:46 and SEQ ID NO:48) also are
provided. These sequences express human polypeptides of SEQ ID NO:2, SEQ ID NO:4 and
SEQ ID NO:98, rat polypeptide of SEQ ID NO:6, and mouse polypeptide of SEQ ID NO:99,
respectively. Each of the foregoing genes is included within all aspects of the following
description. The present invention concerns DNA segments, isolatable from mammalian and
human cells, that are free from total genomic DNA and that are capable of expressing a
functional TLR-4 protein. As used herein, the term "DNA segment" refers to a DNA molecule
that has been isolated free of other genomic DNA of a particular species. Therefore, a DNA
segment encoding a TLR-4 protein refers to a DNA segment that contains TLR-4 protein coding
sequences yet is isolated away from, or purified free from, total mammalian or human genomic
DNA. Included within the term "DNA segment" are DNA segments and smaller fragments of
such segments, and also recombinant vectors, including, for example, plasmids, cosmids, phage,
viruses, and the like.

Similarly, a DNA segment comprising an isolated or purified TLR-4 protein or subunit
gene refers to a DNA segment including purified TLR-4 protein or subunit protein coding
sequences and, in certain aspects, regulatory sequences, isolated substantially away from other
naturally occurring genes or protein encoding sequences. In this respect, the term "gene" is used
for simplicity to refer to a functional protein, polypeptide or peptide encoding unit. As will be
understood by those in the art, this functional term includes genomic sequences,
complementary DNA (cDNA) sequences and smaller engineered gene segments that express, or
may be adapted to express, TLR-4 proteins, polypeptides, domains, peptides, fusion proteins and
mutants.

"Isolated substantially away from other coding sequences" means that the gene of
interest, in this case a TLR-4 protein gene, forms the significant part of the coding region of the
DNA segment, and that the DNA segment does not contain large portions of naturally-occurring
coding DNA, such as large chromosomal fragments or other functional genes or cDNA coding
regions. Of course, this refers to the DNA segment as originally isolated, and does not exclude
genes or coding regions later added to the segment by the hand of man.

In particular embodiments, the invention concerns isolated DNA segments and
recombinant vectors incorporating DNA sequences that encode a TLR-4 protein or subunit that
includes within its amino acid sequence a contiguous amino acid sequence in accordance with,
or essentially as set forth in, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or
SEQ ID NO:99 corresponding to TLR-4 in humans, rat and mouse, respectively. Moreover, in
other particular embodiments, the invention concerns isolated DNA segments and recombinant
vectors that encode a TLR-4 protein or subunit that includes within its amino acid sequence the
substantially full length protein sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:98 or SEQ ID NO:99.

The term "a sequence essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:98 or SEQ ID NO:99" means that the sequence substantially corresponds to
a portion of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID NO:99 and
has relatively few amino acids that are not identical to, or a biologically functional equivalent of,
the amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID
NO:99. The term "biologically functional equivalent" is well understood in the art and is further
defined in detail herein. Accordingly, sequences that have between about 85% and about 90%;
or more preferably, between about 91% and about 95%; or even more preferably, between about
96% and about 99%; of amino acids that are identical or functionally equivalent to the amino
acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID NO:99 will be
sequences that are "essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:98 or SEQ ID NO:99," provided the biological activity of the protein is maintained.
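
By way of illustration only, the percentage comparison described above may be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length, counts only strictly identical residues (it does not attempt to score "functionally equivalent" amino acids), and treats the quoted ranges as simple cut-offs. The function names and the example peptides are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percentage of aligned positions at which both sequences carry the same residue."""
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned, equal in length and non-empty")
    matches = sum(1 for q, r in zip(query.upper(), reference.upper()) if q == r)
    return 100.0 * matches / len(reference)


def identity_tier(pct: float) -> str:
    """Map a percentage onto the preference tiers quoted in the text (illustrative cut-offs)."""
    if pct >= 96.0:
        return "even more preferred (about 96% and above)"
    if pct >= 91.0:
        return "more preferred (about 91% to about 95%)"
    if pct >= 85.0:
        return "within the quoted range (about 85% to about 90%)"
    return "below the quoted ranges"


if __name__ == "__main__":
    # Hypothetical 20-residue peptides differing at the final position.
    reference = "MKTAYIAKQRQISFVKSHFS"
    variant = "MKTAYIAKQRQISFVKSHFA"
    pct = percent_identity(variant, reference)
    print(pct, identity_tier(pct))
```

A percentage alone does not satisfy the definition; the text additionally requires that the biological activity of the protein be maintained.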

In certain other embodiments, the invention concerns isolated DNA segments and
recombinant vectors that include within their sequence a nucleic acid sequence essentially as set
forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID
NO:47 and SEQ ID NO:48. The term "essentially as set forth in SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 and SEQ ID NO:48" is used in
the same sense as described above and means that the nucleic acid sequence substantially
corresponds to a portion of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ
ID NO:46, SEQ ID NO:47 or SEQ ID NO:48 and has relatively few codons that are not
identical, or functionally equivalent, to the codons of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48. Again, DNA
segments that encode TLR-4 or related proteins or subunits will be most preferred.

The term "functionally equivalent codon" is used herein to refer to codons that encode
the same amino acid, such as the six codons for arginine or serine, and also refers to codons that
encode biologically equivalent amino acids (see Codon Table, below).

CODON TABLE

Amino Acids                   Codons

Alanine         Ala   A       GCA GCC GCG GCU
Cysteine        Cys   C       UGC UGU
Aspartic acid   Asp   D       GAC GAU
Glutamic acid   Glu   E       GAA GAG
Phenylalanine   Phe   F       UUC UUU
Glycine         Gly   G       GGA GGC GGG GGU
Histidine       His   H       CAC CAU
Isoleucine      Ile   I       AUA AUC AUU
Lysine          Lys   K       AAA AAG
Leucine         Leu   L       UUA UUG CUA CUC CUG CUU
Methionine      Met   M       AUG
Asparagine      Asn   N       AAC AAU
Proline         Pro   P       CCA CCC CCG CCU
Glutamine       Gln   Q       CAA CAG
Arginine        Arg   R       AGA AGG CGA CGC CGG CGU
Serine          Ser   S       AGC AGU UCA UCC UCG UCU
Threonine       Thr   T       ACA ACC ACG ACU
Valine          Val   V       GUA GUC GUG GUU
Tryptophan      Trp   W       UGG
Tyrosine        Tyr   Y       UAC UAU
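
For illustration, the first sense of "functionally equivalent codon" (codons that encode the same amino acid) may be checked mechanically against the Codon Table above. The following Python sketch is one possible way to do so; the dictionary layout and function name are hypothetical, and the sketch does not address the second sense of the term (codons encoding biologically equivalent amino acids), which depends on the equivalence criteria defined elsewhere in the specification.

```python
# Codons per amino acid (one-letter code), as listed in the Codon Table.
CODON_TABLE = {
    "A": ("GCA", "GCC", "GCG", "GCU"),
    "C": ("UGC", "UGU"),
    "D": ("GAC", "GAU"),
    "E": ("GAA", "GAG"),
    "F": ("UUC", "UUU"),
    "G": ("GGA", "GGC", "GGG", "GGU"),
    "H": ("CAC", "CAU"),
    "I": ("AUA", "AUC", "AUU"),
    "K": ("AAA", "AAG"),
    "L": ("UUA", "UUG", "CUA", "CUC", "CUG", "CUU"),
    "M": ("AUG",),
    "N": ("AAC", "AAU"),
    "P": ("CCA", "CCC", "CCG", "CCU"),
    "Q": ("CAA", "CAG"),
    "R": ("AGA", "AGG", "CGA", "CGC", "CGG", "CGU"),
    "S": ("AGC", "AGU", "UCA", "UCC", "UCG", "UCU"),
    "T": ("ACA", "ACC", "ACG", "ACU"),
    "V": ("GUA", "GUC", "GUG", "GUU"),
    "W": ("UGG",),
    "Y": ("UAC", "UAU"),
}

# Reverse lookup: codon -> amino acid. Stop codons are absent from the table.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODON_TABLE.items()
    for codon in codons
}


def encode_same_amino_acid(codon_a: str, codon_b: str) -> bool:
    """True if both codons appear in the table and encode the same amino acid."""
    # Accept DNA-style codons by converting T to U before the lookup.
    aa_a = CODON_TO_AMINO_ACID.get(codon_a.upper().replace("T", "U"))
    aa_b = CODON_TO_AMINO_ACID.get(codon_b.upper().replace("T", "U"))
    return aa_a is not None and aa_a == aa_b


if __name__ == "__main__":
    print(encode_same_amino_acid("AGA", "CGU"))  # two of the six arginine codons
    print(encode_same_amino_acid("GAC", "GAG"))  # aspartic acid versus glutamic acid
```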











It will also be understood that amino acid and nucleic acid sequences may include
additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, and
yet still be essentially as set forth in one of the sequences disclosed herein, so long as the
sequence meets the criteria set forth above, including the maintenance of biological protein
activity where protein expression is concerned. The addition of terminal sequences particularly
applies to nucleic acid sequences that may, for example, include various non-coding sequences
flanking either of the 5' or 3' portions of the coding region or may include various internal
sequences, i.e., introns, which are known to occur within genes.

Excepting intronic or flanking regions, and allowing for the degeneracy of the genetic
code, sequences that have between about 75% and about 79%; or more preferably, between
about 80% and about 89%; or even more preferably, between about 90% and about 99%; of
nucleotides that are identical to the nucleotides of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:45 or SEQ ID NO:46 will be sequences that are "essentially as set forth in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ
ID NO:48."

Sequences that are essentially the same as those set forth in SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48 may
also be functionally defined as sequences that are capable of hybridizing to a nucleic acid
segment containing the complement of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48 under relatively stringent conditions.
Suitable relatively stringent hybridization conditions will be well known to those of skill in the
art.

Suitable standard hybridization conditions for the present invention include, for example,
hybridization in 50% formamide, 5x Denhardt's solution, 5x SSC, 25 mM sodium phosphate,
0.1% SDS and 100 µg/ml of denatured salmon sperm DNA at 42°C for 16 hours followed by 1
hour sequential washes with 0.1x SSC, 0.1% SDS solution at 60°C to remove the desired
amount of background signal. Lower stringency hybridization conditions for the present
invention include, for example, hybridization in 35% formamide, 5x Denhardt's solution, 5x
SSC, 25 mM sodium phosphate, 0.1% SDS and 100 µg/ml denatured salmon sperm DNA or E.
coli DNA at 42°C for 16 hours followed by sequential washes with 0.8x SSC, 0.1% SDS at
55°C. Those of skill in the art will recognize that conditions can be readily adjusted to obtain
the desired level of stringency.
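
As a rough illustration of why the 0.1x SSC wash at 60°C is the more stringent of the two wash steps, the following Python sketch estimates the sodium ion concentration of each wash buffer. It assumes the commonly used recipe in which 1x SSC contains 150 mM sodium chloride and 15 mM trisodium citrate; that recipe, and all names in the sketch, are assumptions made for illustration and are not stated in this specification.

```python
# Assumed composition of 1x SSC (not stated in this specification).
NACL_MOLAR_1X = 0.150      # mol/L sodium chloride
CITRATE_MOLAR_1X = 0.015   # mol/L trisodium citrate (three Na+ per citrate)


def sodium_molarity(ssc_strength: float) -> float:
    """Approximate Na+ concentration (mol/L) of an SSC buffer of the given strength."""
    if ssc_strength < 0:
        raise ValueError("SSC strength cannot be negative")
    return ssc_strength * (NACL_MOLAR_1X + 3 * CITRATE_MOLAR_1X)


# Wash steps quoted in the text: (SSC strength, temperature in degrees C).
WASHES = {
    "standard": (0.1, 60),
    "lower stringency": (0.8, 55),
}

if __name__ == "__main__":
    for name, (strength, temp_c) in WASHES.items():
        millimolar = sodium_molarity(strength) * 1000
        print(f"{name}: about {millimolar:.1f} mM Na+ at {temp_c} C")
```

Under these assumptions the standard wash combines lower salt with a higher temperature, both of which disfavor imperfectly matched duplexes.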

Naturally, the present invention also encompasses DNA segments that are
complementary, or essentially complementary, to the sequence set forth in SEQ ID NO:1, SEQ
ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48.
Nucleic acid sequences that are "complementary" are those that are capable of base-pairing
according to the standard Watson-Crick complementarity rules. As used herein, the term
"complementary sequences" means nucleic acid sequences that are substantially complementary,
as may be assessed by the same nucleotide comparison set forth above, or as defined as being
capable of hybridizing to the nucleic acid segment of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48, under relatively
stringent conditions such as those described immediately above.
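
The Watson-Crick rules referred to above (A pairs with T, and G pairs with C, on antiparallel strands) may be illustrated with the following minimal Python sketch. The function names and the six-base example are hypothetical; the sketch tests only perfect complementarity and does not model "substantially complementary" sequences or hybridization behavior.

```python
# Translation table implementing the Watson-Crick pairing rules for DNA.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b base-pairs with strand_a at every position."""
    return strand_a.upper() == reverse_complement(strand_b).upper()


if __name__ == "__main__":
    print(reverse_complement("ATGCCG"))
    print(is_perfect_complement("ATGCCG", "CGGCAT"))
```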

The nucleic acid segments of the present invention, regardless of the length of the coding 
sequence itself, may be combined with other DNA sequences, such as promoters, 
polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding 
segments, and the like, such that their overall length may vary considerably. It is therefore 
contemplated that a nucleic acid fragment of almost any length may be employed, with the total 
length preferably being limited by the ease of preparation and use in the intended recombinant 
DNA protocol. 

For example, nucleic acid fragments may be prepared that include a short contiguous
stretch identical to or complementary to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48, such as about 22-27 or about 32-36
nucleotides, and that are up to about 30,000 or 20,000, or about 10,000, or about 5,000 base
pairs in length, with segments of about 3,000 being preferred in certain cases. DNA segments
with total lengths of about 1,000, about 500, about 200, about 100 and about 50 base pairs
(including all intermediate lengths) are also contemplated to be useful.

It will be readily understood that "intermediate lengths," in these contexts, means any
length between the quoted ranges, such as 22, 23, 24, 25, 26, 27, 28, 29, etc.; 30, 31, 32, 33, 34,
etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers
through the 200-500; 500-1,000; 1,000-2,000; 2,000-3,000; 3,000-5,000; 5,000-10,000 ranges,
up to and including sequences of about 12,001, 12,002, 13,001, 13,002, 15,000, 20,000, 30,000
and the like.

The various probes and primers designed around the disclosed nucleotide sequences of the
present invention may be of any length. By assigning numeric values to a sequence, for example,
the first residue is 1, the second residue is 2, etc., an algorithm defining all primers can be
proposed:

n to n + y

where n is an integer from 1 to the last number of the sequence and y is the length of the primer
minus one, where n + y does not exceed the last number of the sequence. Thus, for a 25-mer,
the probes correspond to bases 1 to 25, 2 to 26, 3 to 27 ... and so on. For a 30-mer, the probes
correspond to bases 1 to 30, 2 to 31, 3 to 32 ... and so on. For a 35-mer, the probes correspond
to bases 1 to 35, 2 to 36, 3 to 37 ... and so on.
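
The "n to n + y" algorithm set out above may be sketched in Python as follows. The function name is hypothetical, positions are 1-based as in the text, and the 30-base sequence length used in the example is chosen only to keep the output short.

```python
def primer_windows(sequence_length: int, primer_length: int) -> list[tuple[int, int]]:
    """List every (n, n + y) window, where y is the primer length minus one.

    Positions are 1-based and inclusive, and n + y never exceeds the
    last position of the sequence.
    """
    if sequence_length < 0 or primer_length < 1:
        raise ValueError("sequence_length must be >= 0 and primer_length >= 1")
    y = primer_length - 1
    return [(n, n + y) for n in range(1, sequence_length - y + 1)]


if __name__ == "__main__":
    # All 25-mers within a hypothetical 30-base sequence.
    for start, end in primer_windows(30, 25):
        print(f"bases {start} to {end}")
```

A sequence of length L therefore yields L - y windows for a given primer length, and none at all when the primer is longer than the sequence.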

It will also be understood that this invention is not limited to the particular nucleic acid
and amino acid sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ
ID NO:46, SEQ ID NO:47 or SEQ ID NO:48. Recombinant vectors and isolated DNA segments
may therefore variously include these coding regions themselves, coding regions bearing
selected alterations or modifications in the basic coding region, or they may encode larger
polypeptides that nevertheless include such coding regions or may encode biologically
functional equivalent proteins or peptides that have variant amino acid sequences.

The DNA segments of the present invention encompass biologically functional
equivalent TLR-4 proteins. Such sequences may arise as a consequence of codon redundancy
and functional equivalency that are known to occur naturally within nucleic acid sequences and
the proteins thus encoded. Alternatively, functionally equivalent proteins or peptides may be
created via the application of recombinant DNA technology, in which changes in the protein
structure may be engineered, based on considerations of the properties of the amino acids being
exchanged. Changes designed by man may be introduced through the application of
site-directed mutagenesis techniques, e.g., to introduce improvements to the antigenicity of the
protein or to test mutants in order to examine transcription, elongation or Tat binding activity at
the molecular level.

One may also prepare fusion proteins and peptides, e.g., where the TLR-4 protein coding
regions are aligned within the same expression unit with other proteins or peptides having
desired functions, such as for purification or immunodetection purposes (e.g., proteins that may
be purified by affinity chromatography and enzyme label coding regions, respectively).

Encompassed by the invention are DNA segments encoding relatively small peptides,
such as, for example, peptides of from about 15 to about 50 amino acids in length, and more
preferably, of from about 15 to about 30 amino acids in length; and also larger polypeptides up
to and including proteins corresponding to the full-length sequences set forth in SEQ ID NO:2,
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 and SEQ ID NO:99.

It is proposed that the DNA segments of the present invention may be employed for a
variety of applications. For example, a particularly useful application concerns the recombinant
production of the individual subunits or proteins or peptides whose structure is derived from that
of the subunits, or in the recombinant production of the holoenzyme following co-expression of
the two subunits. Additionally, the TLR-4-encoding DNA segments of the present invention
can also be used in the preparation of nucleic acid probes or primers, which can, for example, be
used in the identification and cloning of TLR-4 genes or related genomic sequences, or in the
study of subunit(s) expression, and the like.

E. Immunologic Detection Methods

In one embodiment, the diagnostic approach will be immunologic. The reagents will
include antibodies to the TLR-4 and TLR-4 mutants, or fragments thereof, and will further include
reagents capable of detecting an antibody immunoreactive with any such compound. Detection
methods include, but are not limited to, ELISA, RIA and immunoblots, as discussed elsewhere in
the specification.

Antibodies against TLR-4 and TLR-4 mutants isolated using the methodology described
will be useful in the present invention, primarily in assays for the detection of individuals
susceptible to Gram-negative infection. Means for preparing and characterizing antibodies are
well known in the art (see, e.g., Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory, 1988; incorporated herein by reference).

The methods for generating monoclonal antibodies (MAbs) generally begin along the
same lines as those for preparing polyclonal antibodies. Briefly, a polyclonal antibody is
prepared by immunizing an animal with an immunogenic composition in accordance with the
present invention and collecting antisera from that immunized animal. A wide range of animal
species can be used for the production of antisera. Typically the animal used for production of
anti-antisera is a rabbit, a mouse, a rat, a hamster, a guinea pig or a goat.

Immunogenic compositions of the invention include TLR-4, TLR-4 mutants or fragments
and the like. As is well known in the art, a given composition may vary in its immunogenicity.
It is often necessary therefore to boost the host immune system, as may be achieved by coupling
a peptide or polypeptide immunogen to a carrier. Exemplary and preferred carriers are keyhole
limpet hemocyanin (KLH) and bovine serum albumin (BSA). Other albumins such as
ovalbumin, mouse serum albumin or rabbit serum albumin can also be used as carriers. Means
for conjugating a compound to a carrier protein are well known in the art and include
glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide ester, carbodiimide and
bis-diazotized benzidine.

As is also well known in the art, the immunogenicity of a particular immunogen
composition can be enhanced by the use of non-specific stimulators of the immune response,
known as adjuvants. Exemplary and preferred adjuvants include complete Freund's adjuvant (a
non-specific stimulator of the immune response containing killed Mycobacterium tuberculosis),
incomplete Freund's adjuvant and aluminum hydroxide adjuvant.

The amount of immunogen composition used in the production of polyclonal antibodies
varies depending upon the nature of the immunogen as well as the animal used for immunization.
A variety of routes can be used to administer the immunogen (subcutaneous, intramuscular,
intradermal, intravenous and intraperitoneal). The production of polyclonal antibodies may be
monitored by sampling blood of the immunized animal at various points following immunization.
A second, booster injection also may be given. The process of boosting and titering is repeated
until a suitable titer is achieved. When a desired level of immunogenicity is obtained, the
immunized animal can be bled and the serum isolated and stored, and/or the animal can be used
to generate MAbs.

MAbs may be readily prepared through use of well-known techniques, such as those
exemplified in U.S. Patent 4,196,265, incorporated herein by reference. Typically, this
technique involves immunizing a suitable animal with a selected immunogen composition, e.g.,
a purified or partially purified protein, polypeptide or peptide. The immunizing composition is
administered in a manner effective to stimulate antibody producing cells. Rodents such as mice
and rats are preferred animals; however, the use of rabbit, sheep or frog cells is also possible. The
use of rats may provide certain advantages, but mice are preferred, with the BALB/c mouse
being most preferred as this is most routinely used and generally gives a higher percentage of
stable fusions.

Following immunization, somatic cells with the potential for producing antibodies,
specifically B lymphocytes (B cells), are selected for use in the MAb generating protocol. These
cells may be obtained from biopsied spleens, tonsils or lymph nodes, or from a peripheral blood
sample. Spleen cells and peripheral blood cells are preferred, the former because they are a rich
source of antibody-producing cells that are in the dividing plasmablast stage, and the latter
because peripheral blood is easily accessible. Often, a panel of animals will have been
immunized and the spleen of the animal with the highest antibody titer will be removed and the
spleen lymphocytes obtained by homogenizing the spleen with a syringe. Typically, a spleen
from an immunized mouse contains approximately 5 x 10^7 to 2 x 10^8 lymphocytes.

The antibody-producing B lymphocytes from the immunized animal are then fused with
cells of an immortal myeloma cell, generally one of the same species as the animal that was
immunized. Myeloma cell lines suited for use in hybridoma-producing fusion procedures
preferably are non-antibody-producing, have high fusion efficiency, and have enzyme deficiencies
that render them incapable of growing in certain selective media which support the growth of
only the desired fused cells (hybridomas).

Any one of a number of myeloma cells may be used, as are known to those of skill in the
art (Campbell, pp. 75-83, 1984). For example, where the immunized animal is a mouse, one
may use P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11,
MPC11-X45-GTG 1.7 and S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3,
IR983F and 4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all useful
in connection with human cell fusions.

Methods for generating hybrids of antibody-producing spleen or lymph node cells and
myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1 proportion,
though the proportion may vary from about 20:1 to about 1:1, respectively, in the presence of an
agent or agents (chemical or electrical) that promote the fusion of cell membranes. Fusion
methods using Sendai virus have been described by Kohler and Milstein (1975; 1976), and those
using polyethylene glycol (PEG), such as 37% (v/v) PEG, by Gefter et al. (1977). The use of
electrically induced fusion methods is also appropriate.

Fusion procedures usually produce viable hybrids at low frequencies, about 1 x 10^-6 to
1 x 10^-8. However, this does not pose a problem, as the viable, fused hybrids are differentiated
from the parental, unfused cells (particularly the unfused myeloma cells that would normally
continue to divide indefinitely) by culturing in a selective medium. The selective medium is
generally one that contains an agent that blocks the de novo synthesis of nucleotides in the tissue
culture media. Exemplary and preferred agents are aminopterin, methotrexate, and azaserine.
Aminopterin and methotrexate block de novo synthesis of both purines and pyrimidines, whereas
azaserine blocks only purine synthesis. Where aminopterin or methotrexate is used, the media is
supplemented with hypoxanthine and thymidine as a source of nucleotides (HAT medium).
Where azaserine is used, the media is supplemented with hypoxanthine.

The preferred selection medium is HAT. Only cells capable of operating nucleotide
salvage pathways are able to survive in HAT medium. The myeloma cells are defective in key
enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl transferase (HPRT), and
they cannot survive. The B cells can operate this pathway, but they have a limited life span in
culture and generally die within about two weeks. Therefore, the only cells that can survive in
the selective media are those hybrids formed from myeloma and B cells.
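
The selection logic described above may be summarized, purely as an illustrative model, by the following Python sketch. It reduces each cell type to two properties taken from the text (whether the salvage pathway is functional, and whether the cell can divide indefinitely in culture); the class, field and function names are hypothetical and the model ignores all other biology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    salvage_pathway: bool  # functional salvage enzymes such as HPRT
    immortal: bool         # able to divide indefinitely in culture


def survives_hat_selection(cell: Cell) -> bool:
    """Long-term survival in HAT medium under this simplified model.

    De novo nucleotide synthesis is blocked, so the salvage pathway is
    required; cells with a limited life span die out regardless.
    """
    return cell.salvage_pathway and cell.immortal


CELLS = [
    Cell("unfused myeloma cell", salvage_pathway=False, immortal=True),
    Cell("unfused B cell", salvage_pathway=True, immortal=False),
    Cell("hybridoma", salvage_pathway=True, immortal=True),
]

if __name__ == "__main__":
    for cell in CELLS:
        outcome = "survives" if survives_hat_selection(cell) else "is eliminated"
        print(f"{cell.name} {outcome}")
```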

This culturing provides a population of hybridomas from which specific hybridomas are
selected. Typically, selection of hybridomas is performed by culturing the cells by single-clone
dilution in microtiter plates, followed by testing the individual clonal supernatants (after about
two to three weeks) for the desired reactivity. The assay should be sensitive, simple and rapid.
Radioimmunoassays, enzyme immunoassays, cytotoxicity assays, plaque assays, dot
immunobinding assays, and the like, may be used.

Where one desires to generate an antibody with defined activity, one would generally
screen the candidate hybridomas to identify those hybridomas that produce antibodies that have
the desired inhibitory or stimulatory properties. Any selected hybridomas would then be serially
diluted and cloned into individual antibody-producing cell lines, which clones can then be
propagated indefinitely to provide MAbs. The cell lines may be exploited for MAb production
in two basic ways. A sample of the hybridoma can be injected (often into the peritoneal cavity)
into a histocompatible animal of the type that was used to provide the somatic and myeloma
cells for the original fusion. The injected animal develops tumors secreting the specific
monoclonal antibody produced by the fused cell hybrid. The body fluids of the animal, such as
serum or ascites fluid, can then be tapped to provide MAbs in high concentration. The
individual cell lines could also be cultured in vitro, where the MAbs are naturally secreted into
the culture medium from which they can be readily obtained in high concentrations. MAbs
produced by either means may be further purified, if desired, using filtration, centrifugation and
various chromatographic methods such as HPLC or affinity chromatography.

Certain aspects of the present invention relate to the detection of TLR-4 and mutants
thereof. One method of detecting such compounds uses immunoassays for agents of the present
invention. Antibodies that recognize TLR-4 or TLR-4 mutants of the present invention are
contemplated to be useful in these immunoassays.

Immunoassays encompassed by the present invention include, but are not limited to, those
described in U.S. Patent No. 4,367,110 (double monoclonal antibody sandwich assay) and U.S.
Patent No. 4,452,901 (western blot). Other assays include immunoprecipitation of labeled
ligands and immunocytochemistry, both in vitro and in vivo.

Immunoassays, in their most simple and direct sense, are binding assays. Certain
preferred immunoassays are the various types of enzyme linked immunosorbent assays (ELISAs)
and radioimmunoassays (RIA) known in the art. Immunohistochemical detection using tissue
sections is also particularly useful.

In one exemplary ELISA, the antibodies of the invention are immobilized onto a selected
surface exhibiting protein affinity, such as a well in a polystyrene microtiter plate. Then, a test
composition suspected of containing the desired antigen, such as a clinical sample, is added to
the wells. After binding and washing to remove non-specifically bound immune complexes, the
bound antigen may be detected. Detection is generally achieved by the addition of another
antibody, specific for the desired antigen, that is linked to a detectable label. This type of ELISA
is a simple "sandwich ELISA". Detection may also be achieved by the addition of a second
antibody specific for the desired antigen, followed by the addition of a third antibody that has
binding affinity for the second antibody, with the third antibody being linked to a detectable
label.

In another exemplary ELISA, the samples suspected of containing the desired antigen are
immobilized onto the well surface and then contacted with the antibodies of the invention. After
binding and appropriate washing, the bound immune complexes are detected. Where the initial
antigen specific antibodies are linked to a detectable label, the immune complexes may be
detected directly. Again, the immune complexes may be detected using a second antibody that
has binding affinity for the first antigen specific antibody, with the second antibody being linked
to a detectable label.

Competition ELISAs are also possible in which test samples compete for binding with
known amounts of labeled antigens or antibodies. The amount of reactive species in the
unknown sample is determined by mixing the sample with the known labeled species before or
during incubation with coated wells. The presence of reactive species in the sample acts to
reduce the amount of labeled species available for binding to the well and thus reduces the
ultimate signal.
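
Because the signal in a competition ELISA falls as the amount of reactive species in the sample rises, results from such assays are often expressed as a percentage reduction relative to a control well containing the labeled species alone. The following Python sketch shows that arithmetic; the percent-inhibition convention, the function name and the example readings are assumptions made for illustration and are not prescribed by this specification.

```python
def percent_inhibition(signal_with_sample: float, signal_without_sample: float) -> float:
    """Reduction in signal caused by the sample, as a percentage of the control signal."""
    if signal_without_sample <= 0:
        raise ValueError("the control signal must be positive")
    return 100.0 * (1.0 - signal_with_sample / signal_without_sample)


if __name__ == "__main__":
    # Hypothetical absorbance readings: control well, then well with test sample.
    control = 1.0
    with_sample = 0.25
    print(f"{percent_inhibition(with_sample, control):.1f}% inhibition")
```

A sample containing no reactive species would give approximately 0% inhibition, and larger percentages indicate more competing material.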

Irrespective of the format employed, ELISAs have certain features in common, such as
coating, incubating or binding, washing to remove non-specifically bound species, and detecting
the bound immune complexes. These are described below.

Antigen or antibodies may also be linked to a solid support, such as in the form of a plate,
beads, dipstick, membrane or column matrix, and the sample to be analyzed applied to the
immobilized antigen or antibody. In coating a plate with either antigen or antibody, one will
generally incubate the wells of the plate with a solution of the antigen or antibody, either
overnight or for a specified period. The wells of the plate will then be washed to remove
incompletely adsorbed material. Any remaining available surfaces of the wells are then "coated"
with a nonspecific protein that is antigenically neutral with regard to the test antisera. These
include bovine serum albumin (BSA), casein and solutions of milk powder. The coating allows
for blocking of nonspecific adsorption sites on the immobilizing surface and thus reduces the
background caused by nonspecific binding of antisera onto the surface.

In ELISAs, it is probably more customary to use a secondary or tertiary detection means
rather than a direct procedure. Thus, after binding of the antigen or antibody to the well, coating
with a non-reactive material to reduce background, and washing to remove unbound material,
the immobilizing surface is contacted with the clinical or biological sample to be tested under
conditions effective to allow immune complex (antigen/antibody) formation. Detection of the
immune complex then requires a labeled secondary binding ligand or antibody, or a secondary
binding ligand or antibody in conjunction with a labeled tertiary antibody or third binding
ligand.

"Under conditions effective to allow immune complex (antigen/antibody) formation" 
means that the conditions preferably include diluting the antigens and antibodies with solutions 
such as BSA, bovine gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween. 
These added agents also tend to assist in the reduction of nonspecific background. 

The suitable conditions also mean that the incubation is at a temperature and for a period
of time sufficient to allow effective binding. Incubation steps are typically from about 1 to 2 to
4 hours, at temperatures preferably on the order of 25° to 27°C, or may be overnight at about
4°C or so.

Following all incubation steps in an ELISA, the contacted surface is washed so as to
remove non-complexed material. Washing often includes washing with a solution of
PBS/Tween, or borate buffer. Following the formation of specific immune complexes between
the test sample and the originally bound material, and subsequent washing, the occurrence of
even minute amounts of immune complexes may be determined.

To provide a detecting means, the second or third antibody will have an associated label
to allow detection. Preferably, this will be an enzyme that will generate color development upon
incubating with an appropriate chromogenic substrate. Thus, for example, one will desire to
contact and incubate the first or second immune complex with a urease, glucose oxidase,
alkaline phosphatase or hydrogen peroxidase-conjugated antibody for a period of time and under
conditions that favor the development of further immune complex formation, e.g., incubation for
2 hours at room temperature in a PBS-containing solution such as PBS-Tween.

After incubation with the labeled antibody, and subsequent to washing to remove
unbound material, the amount of label is quantified, e.g., by incubation with a chromogenic
substrate such as urea and bromocresol purple or
2,2'-azino-di-(3-ethyl-benzthiazoline-6-sulfonic acid) (ABTS) and H2O2, in the case of
peroxidase as the enzyme label. Quantification is then achieved by measuring the degree of
color generation, e.g., using a visible spectra spectrophotometer.
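
The text does not specify how a measured degree of color generation is converted into an amount of antigen. One common approach, assumed here purely for illustration, is to read the sample against a standard curve of known concentrations by linear interpolation. In the Python sketch below, the function name, the concentration units and all readings are hypothetical.

```python
def interpolate_concentration(absorbance: float, standards: list[tuple[float, float]]) -> float:
    """Estimate concentration from absorbance by linear interpolation.

    standards: (concentration, absorbance) pairs measured on the same plate.
    Raises ValueError if the reading lies outside the range of the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (conc_lo, abs_lo), (conc_hi, abs_hi) in zip(points, points[1:]):
        if abs_lo <= absorbance <= abs_hi:
            if abs_hi == abs_lo:
                return conc_lo
            fraction = (absorbance - abs_lo) / (abs_hi - abs_lo)
            return conc_lo + fraction * (conc_hi - conc_lo)
    raise ValueError("absorbance is outside the range of the standards")


if __name__ == "__main__":
    # Hypothetical standard curve: (concentration in ng/ml, absorbance).
    curve = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45)]
    print(round(interpolate_concentration(0.35, curve), 2))
```

Readings outside the range of the standards are rejected rather than extrapolated, since the color response is not assumed to remain linear beyond the measured points.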

Alternatively, the label may be a chemiluminescent one. The use of such labels is
described in U.S. Patent Nos. 5,310,687, 5,238,808 and 5,221,605.

Assays for TLR-4 and TLR-4 mutants of the present invention also can determine
normal/abnormal tissue distribution for diagnostic purposes. Methods for in vitro and in situ
analysis are well known and involve assessing binding of antigen-specific antibodies to tissues,
cells or cell extracts. These are conventional techniques well within the grasp of those skilled in
the art. For example, the antibodies of the present invention may be used in conjunction with
both fresh-frozen and formalin-fixed, paraffin-embedded tissue blocks prepared for study by
immunohistochemistry (IHC). Each tissue block may consist of 50 mg of residual "pulverized"
prostate tumor. The method of preparing tissue blocks from these particulate specimens has
been successfully used in previous IHC studies of various prognostic factors, e.g., in breast
cancer, and is well known to those of skill in the art (Abbondanzo et al., 1990; Allred et al.,
1990; Brown et al., 1990).

Briefly, frozen-sections may be prepared by rehydrating 50 ng of frozen pulverized tumor
at room temperature in PBS in small plastic capsules; pelleting the particles by centrifugation;
resuspending them in a viscous embedding medium (OCT); inverting the capsule and pelleting
again by centrifugation; snap-freezing in -70°C isopentane; cutting the plastic capsule and
removing the frozen cylinder of tissue; securing the tissue cylinder on a cryostat microtome
chuck; and cutting 25-50 serial sections containing an average of about 500 remarkably intact
tumor cells.

Permanent-sections may be prepared by a similar method involving rehydration of the 50
mg sample in a plastic microfuge tube; pelleting; resuspending in 10% formalin for 4 hours
fixation; washing/pelleting; resuspending in warm 2.5% agar; pelleting; cooling in ice water to
harden the agar; removing the tissue/agar block from the tube; infiltrating and embedding the
block in paraffin; and cutting up to 50 serial permanent sections.

Nucleic Acid Detection 

In addition to their use in directing the expression of the TLR-4 protein, the nucleic acid
sequences disclosed herein also have a variety of other uses. For example, they also have utility
as probes or primers in nucleic acid hybridization embodiments.

/♦ Hybridization 

25 The use of a hybridization probe of between 20 and 100 nucleotides in length allows the 

formation of a duplex molecule that is both stable and selective. Molecules having complementary 
sequences over stretches greater than 20 bases in length are generally preferred, in order to increase 
stability and selectivity of the hybrid, and thereby improve the quality and degree of particular 
hybrid molecules obtained. One will generally prefer to design nucleic acid molecules having 
stretches of 20 to 30 nucleotides, or even longer where desired. Such fragments may be readily 
prepared by, for example, directly synthesizing the fragment by chemical means, by the PCR™ 
technology of U.S. Patent 4,603,102 (herein incorporated by reference), or by introducing 
selected sequences into recombinant vectors for recombinant production. 

Accordingly, the nucleotide sequences of the invention may be used for their ability to 
selectively form duplex molecules with complementary stretches of genes or RNAs or to provide 
primers for amplification of DNA or RNA from tissues. Depending on the application envisioned, 
one will desire to employ varying conditions of hybridization to achieve varying degrees of 
selectivity of probe towards target sequence. 

For certain applications, for example, substitution of nucleotides by site-directed 
mutagenesis, it is appreciated that lower stringency conditions are required. Under these 
conditions, hybridization may occur even though the sequences of probe and target strand are not 
perfectly complementary, but are mismatched at one or more positions. Conditions may be 
rendered less stringent by increasing salt concentration and decreasing temperature. For example, 
low stringency hybridization conditions for the present invention, such as hybridization in 35% 
formamide, 5x Denhardt's solution, 5x SSC, 25 mM sodium phosphate, 0.1% SDS and 100 
µg/ml denatured salmon sperm DNA or E. coli DNA at 42°C for 16 hours, followed by 
sequential washes with 0.8x SSC, 0.1% SDS at 55°C, allow cross-species hybridization to 
sequences encoding homologous proteins to occur. Thus, hybridization conditions can be readily manipulated 
depending on the desired results. 
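
By way of illustration only, the dependence of stringency on salt and formamide described above can be estimated numerically. The following Python sketch uses a widely cited empirical approximation for the melting temperature (Tm) of long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.61(% formamide) - 500/L. The formula, the assumed sodium concentration of 0.195 M for 1x SSC, the function names and the 40-nucleotide probe sequence are assumptions introduced for this example and form no part of the disclosed methods; the estimate is approximate, particularly for short probes.

```python
import math

# Assumption: 1x SSC contributes roughly 0.195 M sodium ion.
SSC_1X_SODIUM_MOLAR = 0.195


def gc_percent(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def estimate_tm(seq: str, sodium_molar: float, formamide_percent: float) -> float:
    """Estimate the hybrid melting temperature in degrees C.

    Empirical approximation (an assumption of this sketch):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent(seq)
        - 0.61 * formamide_percent
        - 500.0 / len(seq)
    )


# Hypothetical 40-nucleotide probe, used only to exercise the functions.
probe = "ATGCGTACCTGAAGCTTGGCATCGATCCGTAAGCTTGACC"

for formamide in (35, 50):
    tm = estimate_tm(probe, 5 * SSC_1X_SODIUM_MOLAR, formamide)
    print(f"5x SSC, {formamide}% formamide: estimated Tm = {tm:.1f} C")
```

Under these assumptions, raising formamide from 35% to 50% at constant salt lowers the estimated Tm by about 9°C, so that hybridization at a fixed 42°C takes place closer to the Tm and is correspondingly more stringent. Lowering the salt concentration of a wash has the same directional effect.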

In other embodiments, hybridization may be achieved under conditions of, for example, 
50% formamide, 5x Denhardt's solution, 5x SSC, 25 mM sodium phosphate, 0.1% SDS and 100 
µg/ml of denatured salmon sperm DNA at 42°C for 16 hours followed by 1 hour sequential 
washes with 0.1x SSC, 0.1% SDS solution at 60°C to remove the desired amount of background 
signal. 

In certain embodiments, it will be advantageous to employ nucleic acid sequences of the 
present invention in combination with an appropriate means, such as a label, for determining 
hybridization. A wide variety of appropriate indicator means are known in the art, including 
fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of 
being detected. In preferred embodiments, one may desire to employ a fluorescent label or an 
enzyme tag such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other 
environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator substrates 
are known that can be employed to provide a detection means visible to the human eye or 
spectrophotometrically, to identify specific hybridization with complementary nucleic 
acid-containing samples. 

In general, it is envisioned that the hybridization probes described herein will be useful 
both as reagents in solution hybridization, as in PCR™, for detection of expression of 
corresponding genes, as well as in embodiments employing a solid phase. In embodiments 
involving a solid phase, the test DNA (or RNA) is adsorbed or otherwise affixed to a selected 
matrix or surface. This fixed, single-stranded nucleic acid is then subjected to hybridization with 
selected probes under desired conditions. The selected conditions will depend on the particular 
circumstances based on the particular criteria required (depending, for example, on the G+C 
content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). 
Following washing of the hybridized surface to remove non-specifically bound probe molecules, 
hybridization is detected, or even quantified, by means of the label. 

ii. Amplification and PCR™ 

Nucleic acid used as a template for amplification is isolated from cells contained in the 
biological sample, according to standard methodologies (Sambrook et al., 1989). The nucleic 
acid may be genomic DNA or fractionated or whole cell RNA. Where RNA is used, it may be 
desired to convert the RNA to a cDNA. 

Pairs of primers that selectively hybridize to nucleic acids corresponding to a TLR-4 
protein or a mutant thereof are contacted with the isolated nucleic acid under conditions that 
permit selective hybridization. The term "primer", as defined herein, is meant to encompass any 
nucleic acid that is capable of priming the synthesis of a nascent nucleic acid in a 
template-dependent process. Typically, primers are oligonucleotides from ten to twenty base pairs in 
length, but longer sequences can be employed. Primers may be provided in double-stranded or 
single-stranded form, although the single-stranded form is preferred. 

Once hybridized, the nucleic acid:primer complex is contacted with one or more 
enzymes that facilitate template-dependent nucleic acid synthesis. Multiple rounds of 
amplification, also referred to as "cycles," are conducted until a sufficient amount of 
amplification product is produced. 
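
As a simple illustration of why repeated cycles yield a sufficient amount of product, the following Python sketch models idealized exponential amplification, in which each cycle multiplies the copy number by (1 + efficiency). The model, the function names and the efficiency parameter are assumptions made for this example only; actual reactions reach a plateau and rarely sustain perfect doubling.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles.

    efficiency = 1.0 means perfect doubling per cycle (an idealization).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles giving at least target_copies in the idealized model."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    copies = float(initial_copies)
    cycles = 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


# Hypothetical numbers: 100 starting template copies, target of one billion copies.
print(cycles_needed(100, 1e9))
print(cycles_needed(100, 1e9, efficiency=0.9))
```

In this idealized model a lower per-cycle efficiency increases the number of cycles required to reach the same amount of product.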

Next, the amplification product is detected. In certain applications, the detection may be 
performed by visual means. Alternatively, the detection may involve indirect identification of 
the product via chemiluminescence, radioactive scintigraphy of incorporated radiolabel or 
fluorescent label or even via a system using electrical or thermal impulse signals (Affymax 
technology). 

A number of template dependent processes are available to amplify the marker sequences 
present in a given template sample. One of the best known amplification methods is the 
polymerase chain reaction (referred to as PCR™), which is described in detail in U.S. Patent Nos. 
4,683,195, 4,683,202 and 4,800,159, each incorporated herein by reference in its entirety. 

Briefly, in PCR™, two primer sequences are prepared that are complementary to regions 
on opposite complementary strands of the marker sequence. An excess of deoxynucleoside 
triphosphates is added to a reaction mixture along with a DNA polymerase, e.g., Taq 
polymerase. If the marker sequence is present in a sample, the primers will bind to the marker 
and the polymerase will cause the primers to be extended along the marker sequence by adding 
on nucleotides. By raising and lowering the temperature of the reaction mixture, the extended 
primers will dissociate from the marker to form reaction products, excess primers will bind to 
the marker and to the reaction products and the process is repeated. 

A reverse transcriptase PCR™ (RT-PCR™) amplification procedure may be performed in 
order to quantify the amount of mRNA amplified or to prepare cDNA from the desired mRNA. 
Methods of reverse transcribing RNA into cDNA are well known and described in Sambrook et 
al., 1989. Alternative methods for reverse transcription utilize thermostable, RNA-dependent 
DNA polymerases. These methods are described in WO 90/07641, filed December 21, 1990, 
incorporated herein by reference. Polymerase chain reaction methodologies are well known in 
the art. 

Another method for amplification is the ligase chain reaction ("LCR"), disclosed in EPA 
No. 320 308, incorporated herein by reference in its entirety. In LCR, two complementary probe 
pairs are prepared, and in the presence of the target sequence, each pair will bind to opposite 
complementary strands of the target such that they abut. In the presence of a ligase, the two 
probe pairs will link to form a single unit. By temperature cycling, as in PCR™, bound ligated 
units dissociate from the target and then serve as "target sequences" for ligation of excess probe 
pairs. U.S. Patent 4,883,750 describes a method similar to LCR for binding probe pairs to a 
target sequence. 

Qbeta Replicase, described in PCT Application No. PCT/US87/00880, incorporated 
herein by reference, may also be used as still another amplification method in the present 
invention. In this method, a replicative sequence of RNA that has a region complementary to 
that of a target is added to a sample in the presence of an RNA polymerase. The polymerase will 
copy the replicative sequence that can then be detected. 

An isothermal amplification method, in which restriction endonucleases and ligases are 
used to achieve the amplification of target molecules that contain nucleotide 
5'-[alpha-thio]-triphosphates in one strand of a restriction site, may also be useful in the 
amplification of nucleic acids in the present invention. 

Strand Displacement Amplification (SDA) is another method of carrying out isothermal 
amplification of nucleic acids which involves multiple rounds of strand displacement and 
synthesis, i.e., nick translation. A similar method, called Repair Chain Reaction (RCR), 
involves annealing several probes throughout a region targeted for amplification, followed by a 
repair reaction in which only two of the four bases are present. The other two bases can be 
added as biotinylated derivatives for easy detection. A similar approach is used in SDA. Target 
specific sequences can also be detected using a cyclic probe reaction (CPR). In CPR, a probe 
having 3' and 5' sequences of non-specific DNA and a middle sequence of specific RNA is 
hybridized to DNA that is present in a sample. Upon hybridization, the reaction is treated with 
RNase H, and the products of the probe identified as distinctive products that are released after 
digestion. The original template is annealed to another cycling probe and the reaction is 
repeated. 

Still other amplification methods, described in GB Application No. 2 202 328 and in 
PCT Application No. PCT/US89/01025, each of which is incorporated herein by reference in its 
entirety, may be used in accordance with the present invention. In the former application, 
"modified" primers are used in a PCR™-like, template- and enzyme-dependent synthesis. The 
primers may be modified by labeling with a capture moiety (e.g., biotin) and/or a detector moiety 
(e.g., enzyme). In the latter application, an excess of labeled probes is added to a sample. In 
the presence of the target sequence, the probe binds and is cleaved catalytically. After cleavage, 
the target sequence is released intact to be bound by excess probe. Cleavage of the labeled probe 
signals the presence of the target sequence. 

Other nucleic acid amplification procedures include transcription-based amplification 
systems (TAS), including nucleic acid sequence based amplification (NASBA) and 3SR 
(Gingeras et al., PCT Application WO 88/10315, incorporated herein by reference). In NASBA, 
the nucleic acids can be prepared for amplification by standard phenol/chloroform extraction, 
heat denaturation of a clinical sample, treatment with lysis buffer and minispin columns for 
isolation of DNA and RNA or guanidinium chloride extraction of RNA. These amplification 
techniques involve annealing a primer which has target specific sequences. Following 
polymerization, DNA/RNA hybrids are digested with RNase H while double stranded DNA 
molecules are heat denatured again. In either case the single stranded DNA is made fully double 
stranded by addition of second target specific primer, followed by polymerization. The double- 
stranded DNA molecules are then multiply transcribed by an RNA polymerase such as T7 or 
SP6. In an isothermal cyclic reaction, the RNAs are reverse transcribed into single stranded 
DNA, which is then converted to double stranded DNA, and then transcribed once again with an 
RNA polymerase such as T7 or SP6. The resulting products, whether truncated or complete, 
indicate target specific sequences. 

Davey et al., EPA No. 329 822 (incorporated herein by reference in its entirety) disclose 
a nucleic acid amplification process involving cyclically synthesizing single-stranded RNA 
("ssRNA"), ssDNA, and double-stranded DNA (dsDNA), which may be used in accordance with 
the present invention. The ssRNA is a template for a first primer oligonucleotide, which is 
elongated by reverse transcriptase (RNA-dependent DNA polymerase). The RNA is then 
removed from the resulting DNA:RNA duplex by the action of ribonuclease H (RNase H, an 
RNase specific for RNA in duplex with either DNA or RNA). The resultant ssDNA is a 
template for a second primer, which also includes the sequences of an RNA polymerase 
promoter (exemplified by T7 RNA polymerase) 5' to its homology to the template. This primer 
is then extended by DNA polymerase (exemplified by the large "Klenow" fragment of E. coli 
DNA polymerase I), resulting in a double-stranded DNA ("dsDNA") molecule, having a 
sequence identical to that of the original RNA between the primers and having additionally, at 
one end, a promoter sequence. This promoter sequence can be used by the appropriate RNA 
polymerase to make many RNA copies of the DNA. These copies can then re-enter the cycle 
leading to very swift amplification. With proper choice of enzymes, this amplification can be 
done isothermally without addition of enzymes at each cycle. Because of the cyclical nature of 
this process, the starting sequence can be chosen to be in the form of either DNA or RNA. 

Miller et al., PCT Application WO 89/06700 (incorporated herein by reference in its 
entirety) disclose a nucleic acid sequence amplification scheme based on the hybridization of a 
promoter/primer sequence to a target single-stranded DNA ("ssDNA") followed by transcription 
of many RNA copies of the sequence. This scheme is not cyclic, i.e., new templates are not 
produced from the resultant RNA transcripts. Other amplification methods include "RACE" and 
"one-sided PCR™" (Frohman, 1990, incorporated by reference). 

Methods based on ligation of two (or more) oligonucleotides in the presence of nucleic 
acid having the sequence of the resulting "di-oligonucleotide", thereby amplifying the 
di-oligonucleotide, may also be used in the amplification step of the present invention. 

Following any amplification, it may be desirable to separate the amplification product 
from the template and the excess primer for the purpose of determining whether specific 
amplification has occurred. In one embodiment, amplification products are separated by 
agarose, agarose-acrylamide or polyacrylamide gel electrophoresis using standard methods. See 
Sambrook et al., 1989. 

Alternatively, chromatographic techniques may be employed to effect separation. There 
are many kinds of chromatography which may be used in the present invention: adsorption, 
partition, ion-exchange and molecular sieve, and many specialized techniques for using them 
including column, paper, thin-layer and gas chromatography. 

Amplification products must be visualized in order to confirm amplification of the 
marker sequences. One typical visualization method involves staining of a gel with ethidium 
bromide and visualization under UV light. Alternatively, if the amplification products are 
integrally labeled with radio- or fluorometrically-labeled nucleotides, the amplification products 
can then be exposed to x-ray film or visualized under the appropriate stimulating spectra, 
following separation. 

In one embodiment, visualization is achieved indirectly. Following separation of 
amplification products, a labeled, nucleic acid probe is brought into contact with the amplified 
marker sequence. The probe preferably is conjugated to a chromophore but may be 
radiolabeled. In another embodiment, the probe is conjugated to a binding partner, such as an 
antibody or biotin, and the other member of the binding pair carries a detectable moiety. 

In one embodiment, detection is by Southern blotting and hybridization with a labeled 
probe. The techniques involved in Southern blotting are well known to those of skill in the art 
and can be found in many standard books on molecular protocols. See Sambrook et al., 1989. 
Briefly, amplification products are separated by gel electrophoresis. The gel is then contacted 
with a membrane, such as nitrocellulose, permitting transfer of the nucleic acid and non-covalent 
binding. Subsequently, the membrane is incubated with a chromophore-conjugated probe that is 
capable of hybridizing with a target amplification product. Detection is by exposure of the 
membrane to x-ray film or ion-emitting detection devices. 

One example of the foregoing is described in U.S. Patent No. 5,279,721, incorporated by 
reference herein, which discloses an apparatus and method for the automated electrophoresis and 
transfer of nucleic acids. The apparatus permits electrophoresis and blotting without external 
manipulation of the gel and is ideally suited to carrying out methods according to the present 
invention. 

All the essential materials and reagents required for detecting TLR-4 protein markers in a 
biological sample may be assembled together in a kit. This generally will comprise preselected 
primers for specific markers. Also included may be enzymes suitable for amplifying nucleic 
acids including various polymerases (RT, Taq, etc.), deoxynucleotides and buffers to provide the 
necessary reaction mixture for amplification. 

Such kits generally will comprise, in suitable means, distinct containers for each 
individual reagent and enzyme as well as for each marker primer pair. Preferred pairs of primers 
for amplifying nucleic acids are selected to amplify the sequences specified in SEQ ID NO:1, 
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID 
NO:48. 

In another embodiment, such kits will comprise hybridization probes specific for TLR-4 
protein chosen from a group including nucleic acids corresponding to the sequences specified in 
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47 
or SEQ ID NO:48. Such kits generally will comprise, in suitable means, distinct containers for 
each individual reagent and enzyme as well as for each marker hybridization probe. 

iii. Other Assays 

Other methods for genetic screening to accurately detect genetic changes which may be 
caused by disease, such as bacterial infections that alter normal cellular production and 
processing, in genomic DNA, cDNA or RNA samples may be employed, depending on the 
specific situation. 

For example, one method of screening for genetic variation is based on RNase cleavage 
of base pair mismatches in RNA/DNA and RNA/RNA heteroduplexes. As used herein, the term 
"mismatch" is defined as a region of one or more unpaired or mispaired nucleotides in a 
double-stranded RNA/RNA, RNA/DNA or DNA/DNA molecule. This definition thus includes 
mismatches due to insertion/deletion mutations, as well as single and multiple base point 
mutations. 

U.S. Patent No. 4,946,773 describes an RNase A mismatch cleavage assay that involves 
annealing single-stranded DNA or RNA test samples to an RNA probe, and subsequent 
treatment of the nucleic acid duplexes with RNase A. After the RNase cleavage reaction, the 
RNase is inactivated by proteolytic digestion and organic extraction, and the cleavage products 
are denatured by heating and analyzed by electrophoresis on denaturing polyacrylamide gels. 
For the detection of mismatches, the single-stranded products of the RNase A treatment, 
electrophoretically separated according to size, are compared to similarly treated control 
duplexes. Samples containing smaller fragments (cleavage products) not seen in the control 
duplex are scored as +. 
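
The scoring logic of such a mismatch cleavage assay can be sketched as follows. This Python example compares a wild type sequence with an equal-length test sequence, lists the positions at which a probe derived from the wild type would be mismatched, and computes the probe fragment sizes expected if cleavage occurred at every mismatch. The function names, the toy sequences and the assumption of complete cleavage at each single-base mismatch are illustrative only; as noted herein, not every mismatch is cleaved in practice, and insertion/deletion mismatches are not handled by this sketch.

```python
def find_mismatches(wild_type: str, test: str) -> list:
    """Return 0-based positions where the test sequence differs from wild type."""
    if len(wild_type) != len(test):
        raise ValueError("this sketch only compares sequences of equal length")
    pairs = zip(wild_type.upper(), test.upper())
    return [i for i, (w, t) in enumerate(pairs) if w != t]


def cleavage_fragment_sizes(probe_length: int, mismatches: list) -> list:
    """Fragment sizes if the probe were cut at every mismatch position.

    Simplification: each cut is treated as a clean break at the mismatch index.
    """
    cuts = [0] + sorted(mismatches) + [probe_length]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b - a > 0]


def score_sample(wild_type: str, test: str) -> str:
    """Score '+' when fragments smaller than the full-length probe are expected."""
    return "+" if find_mismatches(wild_type, test) else "-"


# Toy 20-nucleotide sequences (hypothetical); the test sample differs at index 8.
wild_type = "ACGTACGTACGTACGTACGT"
test_sample = "ACGTACGTGCGTACGTACGT"

positions = find_mismatches(wild_type, test_sample)
print(positions, cleavage_fragment_sizes(len(wild_type), positions))
print(score_sample(wild_type, test_sample))
```

A control duplex formed with the wild type sequence itself yields a single full-length fragment in this model, which is the reference against which the smaller fragments are compared.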

Currently available RNase mismatch cleavage assays, including those performed 
according to U.S. Patent No. 4,946,773, require the use of radiolabeled RNA probes. Myers and 
Maniatis in U.S. Patent No. 4,946,773 describe the detection of base pair mismatches using 
RNase A. Other investigators have described the use of an E. coli enzyme, RNase I, in mismatch 
assays. Because it has broader cleavage specificity than RNase A, RNase I would be a desirable 
enzyme to employ in the detection of base pair mismatches if components can be found to 
decrease the extent of non-specific cleavage and increase the frequency of cleavage of 
mismatches. The use of RNase I for mismatch detection is described in literature from Promega 
Biotech. Promega markets a kit containing RNase I that is shown in their literature to cleave 
three out of four known mismatches, provided the enzyme level is sufficiently high. 

The RNase protection assay was first used to detect and map the ends of specific mRNA 
targets in solution. The assay relies on being able to easily generate high specific activity 
radiolabeled RNA probes complementary to the mRNA of interest by in vitro transcription. 
Originally, the templates for in vitro transcription were recombinant plasmids containing 
bacteriophage promoters. The probes are mixed with total cellular RNA samples to permit 
hybridization to their complementary targets, then the mixture is treated with RNase to degrade 
excess unhybridized probe. Also, as originally intended, the RNase used is specific for 
single-stranded RNA, so that hybridized double-stranded probe is protected from degradation. After 
inactivation and removal of the RNase, the protected probe (which is proportional in amount to 
the amount of target mRNA that was present) is recovered and analyzed on a polyacrylamide 
gel. 

The RNase Protection assay was adapted for detection of single base mutations. In this 
type of RNase A mismatch cleavage assay, radiolabeled RNA probes transcribed in vitro from 
wild type sequences are hybridized to complementary target regions derived from test samples. 
The test target generally comprises DNA (either genomic DNA or DNA amplified by cloning in 
plasmids or by PCR™), although RNA targets (endogenous mRNA) have occasionally been 
used. If single nucleotide (or greater) sequence differences occur between the hybridized probe 
and target, the resulting disruption in Watson-Crick hydrogen bonding at that position 
("mismatch") can be recognized and cleaved in some cases by single-strand specific 
ribonuclease. To date, RNase A has been used almost exclusively for cleavage of single-base 
mismatches, although RNase I has recently been shown as useful also for mismatch cleavage. 
There are recent descriptions of using the MutS protein and other DNA-repair enzymes for 
detection of single-base mismatches. 

G. Recombinant Vectors, Host Cells and Expression 

Recombinant vectors form important further aspects of the present invention. The term 
"expression vector or construct" means any type of genetic construct containing a nucleic acid 
coding for a gene product in which part or all of the nucleic acid encoding sequence is capable of 
being transcribed. The transcript may be translated into a protein, but it need not be. Thus, in 
certain embodiments, expression includes both transcription of a gene and translation of an RNA 
into a gene product. In other embodiments, expression only includes transcription of the nucleic 
acid, for example, to generate antisense constructs. 

Particularly useful vectors are contemplated to be those vectors in which the coding 
portion of the DNA segment, whether encoding a full length protein or smaller peptide, is 
positioned under the transcriptional control of a promoter. A "promoter" refers to a DNA 
sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, 
required to initiate the specific transcription of a gene. The phrases "operatively positioned", 
"under control" or "under transcriptional control" mean that the promoter is in the correct 
location and orientation in relation to the nucleic acid to control RNA polymerase initiation and 
expression of the gene. 

The promoter may be in the form of the promoter that is naturally associated with a TLR-4 
protein gene, as may be obtained by isolating the 5' non-coding sequences located upstream of 
the coding segment or exon, for example, using recombinant cloning and/or PCR™ technology, 
in connection with the compositions disclosed herein (PCR™ technology is disclosed in U.S. 
Patent 4,683,202 and U.S. Patent 4,683,195, each incorporated herein by reference). 
Additionally, the toll-like receptors from Drosophila are well known to those of skill in the art. 
The promoter regions of these nucleic acids may be useful herein. 

In other embodiments, it is contemplated that certain advantages will be gained by 
positioning the coding DNA segment under the control of a recombinant, or heterologous, 
promoter. As used herein, a recombinant or heterologous promoter is intended to refer to a 
promoter that is not normally associated with a TLR-4 protein gene in its natural environment. 
Such promoters may include promoters normally associated with other genes, and/or promoters 
isolated from any other bacterial, viral, eukaryotic, or mammalian cell. 

Naturally, it will be important to employ a promoter that effectively directs the 
expression of the DNA segment in the cell type, organism, or even animal, chosen for 
expression. The use of promoter and cell type combinations for protein expression is generally 
known to those of skill in the art of molecular biology, for example, see Sambrook et al. (1989), 
incorporated herein by reference. The promoters employed may be constitutive, or inducible, 
and can be used under the appropriate conditions to direct high level expression of the 
introduced DNA segment, such as is advantageous in the large-scale production of recombinant 
proteins or peptides. 

At least one module in a promoter functions to position the start site for RNA synthesis. 
The best known example of this is the TATA box, but in some promoters lacking a TATA box, 
such as the promoter for the mammalian terminal deoxynucleotidyl transferase gene and the 
promoter for the SV40 late genes, a discrete element overlying the start site itself helps to fix the 
place of initiation. 

Additional promoter elements regulate the frequency of transcriptional initiation. 
Typically, these are located in the region 30-110 bp upstream of the start site, although a number 
of promoters have been shown to contain functional elements downstream of the start site as 
well. The spacing between promoter elements frequently is flexible, so that promoter function is 
preserved when elements are inverted or moved relative to one another. In the thymidine kinase 
(tk) promoter, the spacing between promoter elements can be increased to 50 bp apart before 
activity begins to decline. Depending on the promoter, it appears that individual elements can 
function either co-operatively or independently to activate transcription. 

The particular promoter that is employed to control the expression of a nucleic acid is not 
believed to be critical, so long as it is capable of expressing the nucleic acid in the targeted cell. 
Thus, where a human cell is targeted, it is preferable to position the nucleic acid coding region 
adjacent to and under the control of a promoter that is capable of being expressed in a human 
cell. Generally speaking, such a promoter might include either a human or viral promoter. 
Preferred promoters include those derived from HSV, including the HNF1α promoter. Another 
preferred embodiment is the tetracycline controlled promoter. 

In various other embodiments, the human cytomegalovirus (CMV) immediate early gene 
promoter, the Simian virus 40 (SV40) early promoter and the Rous sarcoma virus long terminal 
repeat can be used to obtain high-level expression of transgenes. The use of other viral or 
mammalian cellular or bacterial phage promoters which are well-known in the art to achieve 
expression of a transgene is contemplated as well, provided that the levels of expression are 
sufficient for a given purpose. The following tables list several elements/promoters which may 
be employed, in the context of the present invention, to regulate the expression of TLR-4 protein 
or subunit gene. This list is not intended to be exhaustive of all the possible elements involved 
in the promotion of transgene expression but, merely, to be exemplary thereof. 

Enhancers were originally detected as genetic elements that increased transcription from 
a promoter located at a distant position on the same molecule of DNA. This ability to act over a 
large distance had little precedent in classic studies of prokaryotic transcriptional regulation. 
Subsequent work showed that regions of DNA with enhancer activity are organized much like 
promoters. That is, they are composed of many individual elements, each of which binds to one 
or more transcriptional proteins. 

The basic distinction between enhancers and promoters is operational. An enhancer 
region as a whole must be able to stimulate transcription at a distance; this need not be true of a 
promoter region or its component elements. On the other hand, a promoter must have one or 
more elements that direct initiation of RNA synthesis at a particular site and in a particular 
orientation, whereas enhancers lack these specificities. Promoters and enhancers are often 
overlapping and contiguous, often seeming to have a very similar modular organization. 

Additionally any promoter/enhancer combination (as per the Eukaryotic Promoter Data 
Base EPDB) could also be used to drive expression of a transgene. Use of a T3, T7 or SP6 
cytoplasmic expression system is another possible embodiment. Eukaryotic cells can support 
cytoplasmic transcription from certain bacterial promoters if the appropriate bacterial 
polymerase is provided, either as part of the delivery complex or as an additional genetic 
expression construct. 

PROMOTER TABLE 

PROMOTER | REFERENCES 
Immunoglobulin Heavy Chain | Banerji et al., 1983; Gilles et al., 1983; Grosschedl and Baltimore, 1985; Imler et al., 1987; Weinberger et al., 1988; Kiledjian et al., 1988; Porton et al., 1990 
Immunoglobulin Light Chain | Queen and Baltimore, 1983; Picard and Schaffner, 1984 
T-Cell Receptor | Luria et al., 1987; Winoto and Baltimore, 1989; Redondo et al., 1990 
HLA DQ α and DQ β | Sullivan and Peterlin, 1987 
β-Interferon | Goodbourn et al., 1986; Fujita et al., 1987; Goodbourn and Maniatis, 1985 
Interleukin-2 | Greene et al., 1989 
Interleukin-2 Receptor | Greene et al., 1989; Lin et al., 1990 
MHC Class II 5 | Koch et al., 1989 
MHC Class II HLA-DRα | Sherman et al., 1989 
β-Actin | Kawamoto et al.; Ng et al., 1989 
Muscle Creatine Kinase | Jaynes et al., 1988; Horlick and Benfield, 1989; Johnson et al., 1989a 
Prealbumin (Transthyretin) | Costa et al., 1988 
Elastase I | Ornitz et al., 1987 
Metallothionein | Karin et al., 1987; Culotta and Hamer, 1989 
Collagenase | Pinkert et al., 1987; Angel et al., 1987 
Albumin Gene | Pinkert et al., 1987; Tronche et al., 1989, 1990 
α-Fetoprotein | Godbout et al., 1988 
γ-Globin | Bodine and Ley, 1987; Perez-Stable and Constantini, 1990 
β-Globin | Trudel and Constantini, 1987 
c-fos | Cohen et al., 1987 
c-HA-ras | Triesman, 1986; Deschamps et al., 1985 
Insulin | Edlund et al., 1985 
Neural Cell Adhesion Molecule (NCAM) | Hirsch et al., 1990 
α1-Antitrypsin | Latimer et al., 1990 
H2B (TH2B) Histone | Hwang et al., 1990 
Mouse or Type I Collagen | Ripe et al., 1989 
Glucose-Regulated Proteins (GRP94 and GRP78) | Chang et al., 1989 
Rat Growth Hormone | Larsen et al., 1986 
Human Serum Amyloid A (SAA) | Edbrooke et al., 1989 
Troponin I (TN I) | Yutzey et al., 1989 
Platelet-Derived Growth Factor | Pech et al., 1989 
Duchenne Muscular Dystrophy | Klamut et al., 1990 
SV40 | Banerji et al., 1981; Moreau et al., 1981; Sleigh and Lockett, 1985; Firak and Subramanian, 1986; Herr and Clarke, 1986; Imbra and Karin, 1986; Kadesch and Berg, 1986; Wang and Calame, 1986; Ondek et al., 1987; Kuhl et al., 1987; Schaffner et al., 1988 
Polyoma | Swartzendruber and Lehman, 1975; Vasseur et al., 1980; Katinka et al., 1980, 1981; Tyndell et al., 1981; Dandolo et al., 1983; deVilliers et al., 1984; Hen et al., 1986; Satake et al., 1988; Campbell and Villarreal, 1988 
Retroviruses | Kriegler and Botchan, 1982, 1983; Levinson et al., 1982; Kriegler et al., 1983, 1984a,b, 1988; Bosze et al., 1986; Miksicek et al., 1986; Celander and Haseltine, 1987; Thiesen et al., 1988; Celander et al., 1988; Choi et al., 1988; Reisman and Rotter, 1989 
Papilloma Virus | Campo et al., 1983; Lusky et al., 1983; Spandidos and Wilkie, 1983; Spalholz et al., 1985; Lusky and Botchan, 1986; Cripe et al., 1987; Gloss et al., 1987; Hirochika et al., 1987; Stephens and Hentschel, 1987; Glue et al., 1988 
Hepatitis B Virus | Bulla and Siddiqui, 1986; Jameel and Siddiqui, 1986; Shaul and Ben-Levy, 1987; Spandau and Lee, 1988; Vannice and Levinson, 1988 
Human Immunodeficiency Virus | Muesing et al., 1987; Hauber and Cullan, 1988; Jakobovits et al., 1988; Feng and Holland, 1988; Takebe et al., 1988; Rowen et al., 1988; Berkhout et al., 1989; Laspia et al., 1989; Sharp and Marciniak, 1989; Braddock et al., 1989 
Cytomegalovirus | Weber et al., 1984; Boshart et al., 1985; Foecking and Hofstetter, 1986 
Gibbon Ape Leukemia Virus | Holbrook et al., 1987; Quinn et al., 1989 

ENHANCER TABLE 

Enhancer | Inducer | References 
MT II | Phorbol Ester (TPA); Heavy metals | Palmiter et al., 1982; Haslinger and Karin, 1985; Imagawa et al., 1987; Karin et al., 1987; Angel et al., 1987b; McNeall et al., 1989 
MMTV (mouse mammary tumor virus) | Glucocorticoids | Huang et al., 1981; Lee et al., 1981; Majors and Varmus, 1983; Chandler et al., 1983; Lee et al., 1984; Ponta et al., 1985; Sakai et al., 1988 
β-Interferon | poly(rI)x; poly(rc) | Tavernier et al., 1983 
Adenovirus 5 E2 | E1A | Imperiale and Nevins, 1984 
Collagenase | Phorbol Ester (TPA) | Angel et al., 1987a 
Stromelysin | Phorbol Ester (TPA) | Angel et al., 1987b 
SV40 | Phorbol Ester (TPA) | 
Murine MX Gene | Interferon; Newcastle Disease Virus | 
GRP78 Gene | A23187 | Resendez et al., 1988 
α-2-Macroglobulin | IL-6 | Kunz et al., 1989 
Vimentin | Serum | Rittling et al., 1989 
MHC Class I Gene H-2kb | Interferon | Blanar et al., 1989 
HSP70 | E1A; SV40 Large T Antigen | Taylor et al., 1989; Taylor and Kingston, 1990a,b 
Proliferin | Phorbol Ester-TPA | Mordacq and Linzer, 1989 
Tumor Necrosis Factor | PMA | Hensel et al., 1989 
Thyroid Stimulating Hormone α Gene | Thyroid Hormone | Chatterjee et al., 1989 

Turning to the expression of the TLR-4 proteins of the present invention, once a suitable 
clone or clones have been obtained, whether they be cDNA based or genomic, one may proceed 
to prepare an expression system. The engineering of DNA segment(s) for expression in a 
prokaryotic or eukaryotic system may be performed by techniques generally known to those of 
skill in recombinant expression. It is believed that virtually any expression system may be 
employed in the expression of the proteins of the present invention. 

Both cDNA and genomic sequences are suitable for eukaryotic expression, as the host 
cell will generally process the genomic transcripts to yield functional mRNA for translation into 
protein. Generally speaking, it may be more convenient to employ as the recombinant gene a 
cDNA version of the gene. It is believed that the use of a cDNA version will provide advantages 
in that the size of the gene will generally be much smaller and more readily employed to 
transfect the targeted cell than will a genomic gene, which will typically be up to an order of 
magnitude larger than the cDNA gene. However, the possibility of employing a genomic 
version of a particular gene where desired is not excluded. 

In expression, one will typically include a polyadenylation signal to effect proper 
polyadenylation of the transcript. The nature of the polyadenylation signal is not believed to be 
crucial to the successful practice of the invention, and any such sequence may be employed. 
Preferred embodiments include the SV40 polyadenylation signal and the bovine growth 
hormone polyadenylation signal, convenient and known to function well in various target cells. 
Also contemplated as an element of the expression cassette is a terminator. These elements can 
serve to enhance message levels and to minimize read through from the cassette into other 
sequences. 

A specific initiation signal also may be required for efficient translation of coding 
sequences. These signals include the ATG initiation codon and adjacent sequences. Exogenous 
translational control signals, including the ATG initiation codon, may need to be provided. One 
of ordinary skill in the art would readily be capable of determining this and providing the 
necessary signals. It is well known that the initiation codon must be "in-frame" with the reading 
frame of the desired coding sequence to ensure translation of the entire insert. The exogenous 
translational control signals and initiation codons can be either natural or synthetic. The 
efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer 
elements. 
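
By way of illustration only, the in-frame requirement can be checked computationally before a construct is assembled. The following Python sketch is not part of the disclosed methods; the function name `first_atg_in_frame`, the 0-based coordinate convention and the example sequences are assumptions introduced here.

```python
def first_atg_in_frame(construct: str, cds_start: int) -> bool:
    """Return True if the first ATG in `construct` is in frame with the coding sequence.

    `cds_start` is the 0-based index of the first base of the first codon of the
    desired coding sequence (a convention assumed for this sketch).
    """
    seq = construct.upper()
    atg = seq.find("ATG")
    # No initiation codon at all, or the first ATG lies past the start of the insert.
    if atg == -1 or atg > cds_start:
        return False
    # In frame when the distance from the ATG to the insert is a whole number of codons.
    return (cds_start - atg) % 3 == 0


# Hypothetical constructs: an exogenous ATG placed directly 5' of an insert (GCT AAA GGT).
in_frame = first_atg_in_frame("CCATGGCTAAAGGT", cds_start=5)
out_of_frame = first_atg_in_frame("CCATGAGCTAAAGGT", cds_start=6)  # one extra base shifts the frame
```

A check of this kind only confirms the reading frame; it does not assess the adjacent sequences that also influence initiation efficiency.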

As used herein, the terms "engineered" and "recombinant" cells are intended to refer to a 
cell into which an exogenous DNA segment or gene, such as a cDNA or gene encoding a TLR-4 
protein, has been introduced. Therefore, engineered cells are distinguishable from naturally 
occurring cells which do not contain a recombinantly introduced exogenous DNA segment or 
gene. Engineered cells are thus cells having a gene or genes introduced through the hand of 
man. Recombinant cells include those having an introduced cDNA or genomic gene, and also 
include genes positioned adjacent to a promoter not naturally associated with the particular 
introduced gene. 

To express a recombinant TLR-4 protein, whether mutant or wild-type, in accordance 
with the present invention one would prepare an expression vector that comprises a TLR-4 
protein-encoding nucleic acid under the control of one or more promoters. To bring a coding 
sequence "under the control of" a promoter, one positions the 5' end of the transcription 
initiation site of the transcriptional reading frame generally between about 1 and about 50 
nucleotides "downstream" of (i.e., 3' of) the chosen promoter. The "upstream" promoter 
stimulates transcription of the DNA and promotes expression of the encoded recombinant 
protein. This is the meaning of "recombinant expression" in this context. 
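
The spacing rule stated above (a transcription initiation site about 1 to about 50 nucleotides 3' of the promoter) can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function name `promoter_spacing_ok` is hypothetical, positions are 0-based indices on the same strand, and spacing is measured from the last base of the promoter to the first transcribed base.

```python
def promoter_spacing_ok(promoter_end: int, initiation_site: int,
                        min_nt: int = 1, max_nt: int = 50) -> bool:
    """Return True if the initiation site lies min_nt..max_nt nucleotides 3' of the promoter.

    promoter_end:    index of the last base of the promoter (assumed convention)
    initiation_site: index of the 5' end of the transcription initiation site
    """
    spacing = initiation_site - promoter_end
    return min_nt <= spacing <= max_nt


# Hypothetical layout: promoter ends at position 100, initiation site at position 125.
acceptable = promoter_spacing_ok(100, 125)
too_far = promoter_spacing_ok(100, 151)
```

Because the text says "about" 1 and "about" 50, the hard limits used here are a simplification and may be relaxed through the `min_nt` and `max_nt` parameters.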

Many standard techniques are available to construct expression vectors containing the 
appropriate nucleic acids and transcriptional/translational control sequences in order to achieve 
protein or peptide expression in a variety of host-expression systems. Cell types available for 
expression include, but are not limited to, bacteria, such as E. coli and B. subtilis transformed 
with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors. 

Certain examples of prokaryotic hosts are E. coli strain RR1, E. coli LE392, E. coli B, 
E. coli χ1776 (ATCC No. 31537) as well as E. coli W3110 (F-, lambda-, prototrophic, ATCC 
No. 273325); bacilli such as Bacillus subtilis; and other enterobacteriaceae such as Salmonella 
typhimurium, Serratia marcescens, and various Pseudomonas species. 

In general, plasmid vectors containing replicon and control sequences which are derived 
from species compatible with the host cell are used in connection with these hosts. The vector 
ordinarily carries a replication site, as well as marking sequences which are capable of providing 
phenotypic selection in transformed cells. For example, E. coli is often transformed using 
pBR322, a plasmid derived from an E. coli species. pBR322 contains genes for ampicillin and 
tetracycline resistance and thus provides easy means for identifying transformed cells. The pBR 
plasmid, or other microbial plasmid or phage must also contain, or be modified to contain, 
promoters which can be used by the microbial organism for expression of its own proteins. 

In addition, phage vectors containing replicon and control sequences that are compatible 
with the host microorganism can be used as transforming vectors in connection with these hosts. 
For example, the phage lambda GEM™-11 may be utilized in making a recombinant phage 
vector which can be used to transform host cells, such as E. coli LE392. 

Further useful vectors include pIN vectors and pGEX vectors, for use in generating 
glutathione S-transferase (GST) soluble fusion proteins for later purification and separation or 
cleavage. Other suitable fusion proteins are those with β-galactosidase, ubiquitin, and the like. 

Promoters that are most commonly used in recombinant DNA construction include the 
β-lactamase (penicillinase), lactose and tryptophan (trp) promoter systems. While these are the 
most commonly used, other microbial promoters have been discovered and utilized, and details 
concerning their nucleotide sequences have been published, enabling those of skill in the art to 
ligate them functionally with plasmid vectors. 

The following details concerning recombinant protein production in bacterial cells, such 
as E. coli, are provided by way of exemplary information on recombinant protein production in 
general, the adaptation of which to a particular recombinant expression system will be known to 
those of skill in the art. 

Bacterial cells, for example, E. coli, containing the expression vector are grown in any of 
a number of suitable media, for example, LB. The expression of the recombinant protein may be 
induced, e.g., by adding IPTG to the media or by switching incubation to a higher temperature. 
After culturing the bacteria for a further period, generally of between 2 and 24 hours, the cells 
are collected by centrifugation and washed to remove residual media. 

The bacterial cells are then lysed, for example, by disruption in a cell homogenizer and 
centrifuged to separate the dense inclusion bodies and cell membranes from the soluble cell 
components. This centrifugation can be performed under conditions whereby the dense 
inclusion bodies are selectively enriched by incorporation of sugars, such as sucrose, into the 
buffer and centrifugation at a selective speed. 

If the recombinant protein is expressed in the inclusion bodies, as is the case in many 
instances, these can be washed in any of several solutions to remove some of the contaminating 
host proteins, then solubilized in solutions containing high concentrations of urea (e.g., 8 M) or 
chaotropic agents such as guanidine hydrochloride in the presence of reducing agents, such as 
β-mercaptoethanol or DTT (dithiothreitol). 

Under some circumstances, it may be advantageous to incubate the protein for several 
hours under conditions suitable for the protein to undergo a refolding process into a 
conformation which more closely resembles that of the native protein. Such conditions 
generally include low protein concentrations, less than 500 mg/ml, low levels of reducing agent, 
concentrations of urea less than 2 M and often the presence of reagents such as a mixture of 
reduced and oxidized glutathione which facilitate the interchange of disulfide bonds within the 
protein molecule. 

The refolding process can be monitored, for example, by SDS-PAGE, or with antibodies 
specific for the native molecule (which can be obtained from animals vaccinated with the native 
molecule or smaller quantities of recombinant protein). Following refolding, the protein can 
then be purified further and separated from the refolding mixture by chromatography on any of 
several supports including ion exchange resins, gel permeation resins or on a variety of affinity 
columns. 

For expression in Saccharomyces, the plasmid YRp7, for example, is commonly used. 
This plasmid already contains the trp1 gene which provides a selection marker for a mutant strain 
of yeast lacking the ability to grow in the absence of tryptophan, for example ATCC No. 44076 or 
PEP4-1. The presence of the trp1 lesion as a characteristic of the yeast host cell genome then 
provides an effective environment for detecting transformation by growth in the absence of tryptophan. 

Suitable promoting sequences in yeast vectors include the promoters for 
3-phosphoglycerate kinase or other glycolytic enzymes, such as enolase, glyceraldehyde-3- 
phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6- 
phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, 
phosphoglucose isomerase, and glucokinase. In constructing suitable expression plasmids, the 
termination sequences associated with these genes are also ligated into the expression vector 3' 
of the sequence desired to be expressed to provide polyadenylation of the mRNA and 
termination. 

Other suitable promoters, which have the additional advantage of transcription controlled 
by growth conditions, include the promoter region for alcohol dehydrogenase 2, isocytochrome 
C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the 
aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for 
maltose and galactose utilization. 

In addition to microorganisms, cultures of cells derived from multicellular organisms 
may also be used as hosts. In principle, any such cell culture is workable, whether from 
vertebrate or invertebrate culture. In addition to mammalian cells, these include insect cell 
systems infected with recombinant virus expression vectors (e.g., baculovirus); and plant cell 
systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression 
vectors (e.g., Ti plasmid) containing one or more Toll protein coding sequences. 

In a useful insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is 
used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The 
TLR-4 protein coding sequences are cloned into non-essential regions (for example the 
polyhedrin gene) of the virus and placed under control of an AcNPV promoter (for example the 
polyhedrin promoter). Successful insertion of the coding sequences results in the inactivation of 
the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the 
proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are then used 
to infect Spodoptera frugiperda cells in which the inserted gene is expressed (e.g., U.S. Patent 
No. 4,215,051, Smith, incorporated herein by reference). 

Examples of useful mammalian host cell lines are VERO and HeLa cells, Chinese 
hamster ovary (CHO) cell lines, WI38, BHK, COS-7, 293, HepG2, 3T3, RIN and MDCK cell 
lines. In addition, a host cell strain may be chosen that modulates the expression of the inserted 
sequences, or modifies and processes the gene product in the specific fashion desired. Such 
modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be 
important for the function of the protein. 

Different host cells have characteristic and specific mechanisms for the post-translational 
processing and modification of proteins. Appropriate cell lines or host systems can be chosen 
to ensure the correct modification and processing of the foreign protein expressed. 

Expression vectors for use in such mammalian cells ordinarily include an origin of 
replication (as necessary), a promoter located in front of the gene to be expressed, along with 
any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional 
terminator sequences. The origin of replication may be provided either by construction of the 
vector to include an exogenous origin, such as may be derived from SV40 or other viral (e.g., 
Polyoma, Adeno, VSV, BPV) source, or may be provided by the host cell chromosomal 
replication mechanism. If the vector is integrated into the host cell chromosome, the latter is 
often sufficient. 

The promoters may be derived from the genome of mammalian cells (e.g., 
metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the 
vaccinia virus 7.5K promoter). Further, it is also possible, and may be desirable, to utilize 
promoter or control sequences normally associated with the desired TLR-4 protein gene 
sequence, provided such control sequences are compatible with the host cell systems. 

A number of viral based expression systems may be utilized; for example, commonly 
used promoters are derived from polyoma, Adenovirus 2, and most frequently Simian Virus 40 
(SV40). The early and late promoters of SV40 virus are particularly useful because both are 
obtained easily from the virus as a fragment which also contains the SV40 viral origin of 
replication. Smaller or larger SV40 fragments may also be used, provided there is included the 
approximately 250 bp sequence extending from the Hind III site toward the Bgl I site located in 
the viral origin of replication. 

In cases where an adenovirus is used as an expression vector, the coding sequences may 
be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and 
tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by 
in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., 
region E1 or E3) will result in a recombinant virus that is viable and capable of expressing 
TLR-4 proteins in infected hosts. 

Specific initiation signals may also be required for efficient translation of TLR-4 protein 
coding sequences. These signals include the ATG initiation codon and adjacent sequences. 
Exogenous translational control signals, including the ATG initiation codon, may additionally 
need to be provided. One of ordinary skill in the art would readily be capable of determining 
this and providing the necessary signals. It is well known that the initiation codon must be 
in-frame (or in-phase) with the reading frame of the desired coding sequence to ensure translation 
of the entire insert. These exogenous translational control signals and initiation codons can be 
of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced 
by the inclusion of appropriate transcription enhancer elements and transcription terminators. 

In eukaryotic expression, one will also typically desire to incorporate into the 
transcriptional unit an appropriate polyadenylation site (e.g., 5'-AATAAA-3') if one was not 
contained within the original cloned segment. Typically, the poly-A addition site is placed about 
30 to 2000 nucleotides "downstream" of the termination site of the protein at a position prior to 
transcription termination. 
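
As an illustration of the placement described above, the following Python sketch scans the region 3' of a coding sequence for the 5'-AATAAA-3' signal and reports whether any copy falls about 30 to 2000 nucleotides downstream. The function names, the 0-based indexing and the choice to measure offsets from the first base after the stop codon are assumptions made for this example only.

```python
def polya_offsets(seq: str, stop_end: int, signal: str = "AATAAA") -> list[int]:
    """Return the offset of every `signal` found at or after `stop_end`.

    `stop_end` is the index of the first base after the stop codon; each offset is
    the number of bases between that position and the start of the signal.
    """
    seq = seq.upper()
    offsets = []
    pos = seq.find(signal, stop_end)
    while pos != -1:
        offsets.append(pos - stop_end)
        pos = seq.find(signal, pos + 1)
    return offsets


def has_suitable_polya(seq: str, stop_end: int, lo: int = 30, hi: int = 2000) -> bool:
    """True if at least one signal lies lo..hi nucleotides downstream of the stop codon."""
    return any(lo <= off <= hi for off in polya_offsets(seq, stop_end))


# Hypothetical transcriptional unit: short ORF, 40-base spacer, then the signal.
unit = "ATGAAATAA" + "C" * 40 + "AATAAA" + "C" * 10
ok = has_suitable_polya(unit, stop_end=9)
```

If no suitable signal is found in the cloned segment, one would be supplied within the stated window, as the text indicates.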

For long-term, high-yield production of recombinant TLR-4 proteins, stable expression is 
preferred. For example, cell lines that stably express constructs encoding TLR-4 proteins may be 
engineered. Rather than using expression vectors that contain viral origins of replication, host 
cells can be transformed with vectors controlled by appropriate expression control elements 
(e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a 
selectable marker. Following the introduction of foreign DNA, engineered cells may be allowed 
to grow for 1-2 days in an enriched media, and then are switched to a selective media. The 
selectable marker in the recombinant plasmid confers resistance to the selection and allows cells 
to stably integrate the plasmid into their chromosomes and grow to form foci which in turn can 
be cloned and expanded into cell lines. 

A number of selection systems may be used, including, but not limited to, the herpes 
simplex virus (HSV) tk, hypoxanthine-guanine phosphoribosyltransferase (hgprt) and adenine 
phosphoribosyltransferase (aprt) genes, in tk-, hgprt- or aprt- cells, respectively. Also, 
antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to 
methotrexate; gpt, which confers resistance to mycophenolic acid; neo, which confers resistance to 
the aminoglycoside G-418; and hygro, which confers resistance to hygromycin. 

Animal cells can be propagated in vitro in two modes: as non-anchorage dependent cells 
growing in suspension throughout the bulk of the culture or as anchorage-dependent cells 
requiring attachment to a solid substrate for their propagation (i.e., a monolayer type of cell 
growth). 

Non-anchorage dependent or suspension cultures from continuous established cell lines 
are the most widely used means of large scale production of cells and cell products. However, 
suspension cultured cells have limitations, such as tumorigenic potential and lower protein 
production than adherent cells. 

Large scale suspension culture of mammalian cells in stirred tanks is a common method 
for production of recombinant proteins. Two suspension culture reactor designs are in wide use 
- the stirred reactor and the airlift reactor. The stirred design has successfully been used on an 
8000 liter capacity for the production of interferon. Cells are grown in a stainless steel tank with 
a height-to-diameter ratio of 1:1 to 3:1. The culture is usually mixed with one or more agitators, 
based on bladed disks or marine propeller patterns. Agitator systems offering less shear forces 
than blades have been described. Agitation may be driven either directly or indirectly by 
magnetically coupled drives. Indirect drives reduce the risk of microbial contamination through 
seals on stirrer shafts. 

The airlift reactor, also initially described for microbial fermentation and later adapted 
for mammalian culture, relies on a gas stream to both mix and oxygenate the culture. The gas 
stream enters a riser section of the reactor and drives circulation. Gas disengages at the culture 
surface, causing denser liquid free of gas bubbles to travel downward in the downcomer section 
of the reactor. The main advantage of this design is the simplicity and lack of need for 
mechanical mixing. Typically, the height-to-diameter ratio is 10:1. The airlift reactor scales up 
relatively easily, has good mass transfer of gases and generates relatively low shear forces. 
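
As a rough worked example of the height-to-diameter ratios mentioned for the two reactor designs, the dimensions of a vessel can be estimated from its working volume. This is a sketch under stated assumptions: the culture is treated as a plain right cylinder (no headspace, dished ends or internals), and the function name `cylinder_dimensions` is hypothetical. Applying the 10:1 airlift ratio to the same 8000 liter volume is for comparison only; the text cites 8000 liters only for the stirred design.

```python
import math


def cylinder_dimensions(volume_liters: float, height_to_diameter: float) -> tuple[float, float]:
    """Return (diameter_m, height_m) of a right cylinder with the given volume and H:D ratio.

    With H = r * D, the volume is V = pi * (D / 2)**2 * H = pi * r * D**3 / 4,
    so D = (4 * V / (pi * r)) ** (1 / 3).
    """
    volume_m3 = volume_liters / 1000.0
    diameter = (4.0 * volume_m3 / (math.pi * height_to_diameter)) ** (1.0 / 3.0)
    return diameter, diameter * height_to_diameter


# Stirred tank at the two ends of the 1:1 to 3:1 range, and an airlift vessel at 10:1.
for ratio in (1.0, 3.0, 10.0):
    d, h = cylinder_dimensions(8000.0, ratio)
    print(f"H:D = {ratio:g}:1 -> diameter {d:.2f} m, height {h:.2f} m")
```

The comparison shows why the airlift design is a tall, narrow column for the same working volume, whereas the stirred tank is comparatively squat.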

It is contemplated that the TLR-4 protein of the invention may be "overexpressed", i.e., 
expressed in increased levels relative to its natural expression in cells. Such overexpression may 
be assessed by a variety of methods, including radio-labeling and/or protein purification. 
However, simple and direct methods are preferred, for example, those involving SDS/PAGE and 
protein staining or western blotting, followed by quantitative analyses, such as densitometric 
scanning of the resultant gel or blot. A specific increase in the level of the recombinant protein 
or peptide in comparison to the level in natural cells is indicative of overexpression, as is a 
relative abundance of the specific protein in relation to the other proteins produced by the host 
cell and, e.g., visible on a gel. 
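
The quantitative comparison described above can be sketched as a simple ratio of densitometric band intensities. The following Python example is illustrative only: the function name `fold_overexpression`, the use of a loading-control band for normalization and the intensity values are all assumptions, not part of the disclosure.

```python
def fold_overexpression(target_recombinant: float, loading_recombinant: float,
                        target_control: float, loading_control: float) -> float:
    """Fold increase of a target band in engineered cells relative to natural cells.

    Each target band intensity is first divided by the intensity of a loading-control
    band from the same lane (an assumed normalization step), then the two normalized
    values are compared.
    """
    normalized_recombinant = target_recombinant / loading_recombinant
    normalized_control = target_control / loading_control
    return normalized_recombinant / normalized_control


# Hypothetical densitometry readings in arbitrary units.
fold = fold_overexpression(1200.0, 500.0, 300.0, 500.0)
```

A value greater than 1 would be consistent with overexpression in the sense used here, subject to the linear range of the staining or blotting method.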

H. Kits 

All the essential materials and reagents required for detecting TLR-4 polynucleotides or 
polypeptides may be assembled together in a kit. When the components of the kit are provided 
in one or more liquid solutions, the liquid solution preferably is an aqueous solution, with a 
sterile aqueous solution being particularly preferred. Each component preferably is supplied in a 
separate container. 

For therapeutic uses, a polynucleotide or candidate substance, as identified according to 
the methods disclosed herein, may be formulated into a single or separate pharmaceutically 
acceptable syringeable composition. In this case, the container means may itself be an inhalant, 
syringe, pipette, eye dropper, or other such like apparatus, from which the formulation may be 
applied to an infected area of the body, injected into an animal, or even applied to and mixed 
with the other components of the kit. 

The components of these kits may also be provided in dried or lyophilized forms. When 
reagents or components are provided as a dried form, reconstitution generally is by the addition 
of a suitable solvent. It is envisioned that the solvent also may be provided in another container 
means. The kits of the invention may also include an instruction sheet defining administration 
of active compounds or explaining assays for detecting TLR-4 or TLR-4 mutants in samples. 

The kits of the present invention also will typically include a means for containing the 
vials in close confinement for commercial sale such as, e.g., injection or blow-molded plastic 
containers into which the desired vials are retained. Irrespective of the number or type of 
containers, the kits of the invention also may comprise, or be packaged with, an instrument for 
assisting with the injection/administration or placement of the ultimate complex composition 
within the body of an animal. Such an instrument may be an inhalant, syringe, pipette, forceps, 
measured spoon, eye dropper or any such medically approved delivery vehicle. Other 
instrumentation includes devices that permit the reading or monitoring of reactions in vitro. 

I. Biological Functional Equivalents 

As will be understood by those of skill in the art, modification and changes may be made 
in the structure of the TLR-4 protein and subunits and still obtain a molecule having like or 
otherwise desirable characteristics. For example, certain amino acids may be substituted for 
other amino acids in a protein structure without appreciable loss of interactive binding capacity. 
Since it is the interactive capacity and nature of a protein that defines that protein's biological 
functional activity, certain amino acid sequence substitutions can be made in a protein sequence 
(or, of course, its underlying DNA coding sequence) and nevertheless obtain a protein with like 
(agonistic) properties. It is thus contemplated that various changes may be made in the sequence 
of TLR-4 proteins or subunits (or underlying DNA) without appreciable loss of their biological 
utility or activity. 

Equally, the same considerations may be employed to create a TLR-4 protein or subunit 
with countervailing (e.g., antagonistic) properties. This is relevant to the present invention in 
which TLR-4 analogues without endotoxin recognition activity are contemplated to be useful in 
inhibiting the secretion of TNF. 

In terms of functional equivalents, it is also well understood by the skilled artisan that, 
inherent in the definition of a biologically functional equivalent protein or peptide, is the concept 
that there is a limit to the number of changes that may be made within a defined portion of the 
molecule and still result in a molecule with an acceptable level of equivalent biological activity. 
Biologically functional equivalent peptides are thus defined herein as those peptides in which 
certain, not most or all, of the amino acids may be substituted. In particular, where small 
peptides are concerned, fewer amino acids may be changed. Of course, a plurality of distinct 
proteins/peptides with different substitutions may easily be made and used in accordance with 
the invention. 

It is also well understood that where certain residues are shown to be particularly 
important to the biological or structural properties of a protein or peptide, e.g., residues in the 
active site of an enzyme or residues needed to maintain protein function, such residues may not 
generally be exchanged. 

Amino acid substitutions are generally based on the relative similarity of the amino acid 
side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the 
like. An analysis of the size, shape and type of the amino acid side-chain substituents reveals 
that arginine, lysine and histidine are all positively charged residues; that alanine, glycine and 
serine are all a similar size; and that phenylalanine, tryptophan and tyrosine all have a generally 
similar shape. Therefore, based upon these considerations, arginine, lysine and histidine; 
alanine, glycine and serine; and phenylalanine, tryptophan and tyrosine; are defined herein as 
biologically functional equivalents. 
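
The three groups defined above lend themselves to a simple lookup. The following Python sketch is offered only as an illustration; the names `EQUIVALENCE_GROUPS` and `is_defined_equivalent` and the use of three-letter residue codes are assumptions introduced here.

```python
# The three groups of residues defined above as biologically functional equivalents.
EQUIVALENCE_GROUPS = [
    {"Arg", "Lys", "His"},  # positively charged residues
    {"Ala", "Gly", "Ser"},  # residues of similar size
    {"Phe", "Trp", "Tyr"},  # residues of generally similar shape
]


def is_defined_equivalent(original: str, substitute: str) -> bool:
    """True if both residues fall within the same defined group."""
    return any(original in group and substitute in group for group in EQUIVALENCE_GROUPS)


conservative = is_defined_equivalent("Arg", "Lys")      # same group
non_conservative = is_defined_equivalent("Arg", "Ala")  # different groups
```

Residues outside these three groups are simply not covered by this lookup; the index-based criteria discussed in the following paragraphs address them.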

To effect more quantitative changes, the hydropathic index of amino acids may be 
considered. Each amino acid has been assigned a hydropathic index on the basis of its 
hydrophobicity and charge characteristics; these are: isoleucine (+4.5); valine (+4.2); leucine 
(+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine 
(-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine 
(-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and 
arginine (-4.5). 

The importance of the hydropathic amino acid index in conferring interactive biological 
function on a protein is generally understood in the art (Kyte & Doolittle, 1982, incorporated 
herein by reference). It is known that certain amino acids may be substituted for other amino 
acids having a similar hydropathic index or score and still retain a similar biological activity. In 
making changes based upon the hydropathic index, the substitution of amino acids whose 
hydropathic indices are within ±2 is preferred, those which are within ±1 are particularly 
preferred, and those within ±0.5 are even more particularly preferred. 
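
The ±2, ±1 and ±0.5 criteria can be applied mechanically to the index values listed above. The following Python sketch is illustrative only; the names `HYDROPATHY` and `hydropathy_preference`, the three-letter residue codes and the choice to return `None` for substitutions outside ±2 are assumptions of this example.

```python
# Hydropathic index values as listed in the text (Kyte & Doolittle, 1982).
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def hydropathy_preference(original: str, substitute: str):
    """Classify a substitution by the difference in hydropathic index."""
    # Round to one decimal so floating-point noise does not affect the thresholds.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[substitute]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None  # outside the +/-2 window


tier = hydropathy_preference("Ile", "Val")  # difference of 0.3
```

For instance, isoleucine to valine differs by 0.3 and falls in the tightest tier, whereas isoleucine to arginine differs by 9.0 and falls outside all three.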

It is also understood in the art that the substitution of like amino acids can be made
effectively on the basis of hydrophilicity, particularly where the biological functional equivalent
protein or peptide thereby created is intended for use in immunological embodiments, as in the
present case. U.S. Patent 4,554,101, incorporated herein by reference, states that the greatest
local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino
acids, correlates with its immunogenicity and antigenicity, i.e., with a biological property of the
protein.

As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been
assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate
(+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5);
leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).

In making changes based upon similar hydrophilicity values, the substitution of amino
acids whose hydrophilicity values are within ±2 is preferred, those which are within ±1 are
particularly preferred, and those within ±0.5 are even more particularly preferred.
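
The notion of "greatest local average hydrophilicity" can be sketched as a sliding-window average over a peptide sequence. The following Python sketch uses the central hydrophilicity values listed above (ignoring the stated ± 1 ranges); the window length of six residues and the function name `most_hydrophilic_window` are assumptions made for illustration only.

```python
# Central hydrophilicity values as listed in the text (U.S. Patent 4,554,101),
# keyed by one-letter amino acid code. The "± 1" ranges are ignored here.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def most_hydrophilic_window(sequence: str, window: int = 6):
    """Return (start index, segment, average) of the most hydrophilic window.

    The window length is an assumption of this sketch, not a value from the text.
    """
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if avg > best_avg:  # keep the first window reaching the maximum
            best_start, best_avg = start, avg
    return best_start, sequence[best_start:best_start + window], best_avg


if __name__ == "__main__":
    print(most_hydrophilic_window("LLLLLLRKDERKLLLL"))
```

In this toy sequence the charged stretch is reported as the window with the greatest local average hydrophilicity, which is the kind of region the cited patent associates with antigenicity.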

While discussion has focused on functionally equivalent polypeptides arising from amino 
acid changes, it will be appreciated that these changes may be effected by alteration of the
encoding DNA; taking into consideration also that the genetic code is degenerate and that two or 
more codons may code for the same amino acid. A table of amino acids and their codons is 
presented hereinabove for use in such embodiments, as well as for other uses, such as in the 
design of probes and primers and the like. 

I. Mutagenesis

Site-specific mutagenesis is a technique useful in the preparation of individual peptides,
or biologically functional equivalent proteins or peptides, through specific mutagenesis of the
underlying DNA. The technique further provides a ready ability to prepare and test sequence
variants, incorporating one or more of the foregoing considerations, by introducing one or more
nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of
mutants through the use of specific oligonucleotide sequences which encode the DNA sequence
of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a
primer sequence of sufficient size and sequence complexity to form a stable duplex on both
sides of the deletion junction being traversed. Typically, a primer of about 17 to 25 nucleotides
in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence
being altered.
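
As an illustration of the primer geometry just described, the following Python sketch builds a mutagenic primer for a single-base substitution, with an equal number of unaltered flanking residues on each side. Restricting the change to one base, the function name `design_mutagenic_primer` and the zero-based `position` argument are assumptions of this sketch; the text itself contemplates one or more nucleotide changes.

```python
def design_mutagenic_primer(template: str, position: int, new_base: str,
                            flank: int = 10) -> str:
    """Return a primer carrying a single-base change centred between two flanks.

    `position` is a zero-based index into `template`; `flank` is the number of
    unaltered residues kept on each side (5 to 10, as suggested in the text).
    """
    template = template.upper()
    new_base = new_base.upper()
    if not 5 <= flank <= 10:
        raise ValueError("flank should be about 5 to 10 residues")
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("not enough template sequence around the mutation site")
    if template[position] == new_base:
        raise ValueError("new base is identical to the template base")
    left = template[position - flank:position]
    right = template[position + 1:position + flank + 1]
    return left + new_base + right


if __name__ == "__main__":
    toy_template = "ACGT" * 10  # hypothetical 40-nucleotide template
    print(design_mutagenic_primer(toy_template, position=20, new_base="G", flank=8))
```

With a flank of 8 the primer is 17 nucleotides long, and with a flank of 10 it is 21 nucleotides long, both within the 17 to 25 nucleotide range mentioned above.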

In general, the technique of site-specific mutagenesis is well known in the art. As will be
appreciated, the technique typically employs a bacteriophage vector that exists in both a single
stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include
vectors such as the M13 phage. These phage vectors are commercially available and their use is
generally well known to those skilled in the art. Double stranded plasmids are also routinely
employed in site directed mutagenesis, which eliminates the step of transferring the gene of
interest from a phage to a plasmid.

In general, site-directed mutagenesis is performed by first obtaining a single-stranded
vector, or melting apart the two strands of a double stranded vector, which includes within its
sequence a DNA sequence encoding the desired protein. An oligonucleotide primer bearing the
desired mutated sequence is synthetically prepared. This primer is then annealed with the
single-stranded DNA preparation, and subjected to DNA polymerizing enzymes such as E. coli
polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing
strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated
sequence and the second strand bears the desired mutation. This heteroduplex vector is then used
to transform appropriate cells, such as E. coli cells, and clones are selected that include
recombinant vectors bearing the mutated sequence arrangement.

The preparation of sequence variants of the selected gene using site-directed mutagenesis
is provided as a means of producing potentially useful species and is not meant to be limiting, as
there are other ways in which sequence variants of genes may be obtained. For example,
recombinant vectors encoding the desired gene may be treated with mutagenic agents, such as
hydroxylamine, to obtain sequence variants.

J. Inhibitors, Stimulators and Screening Assays 

In still further embodiments, the present invention provides methods for identifying new 
TLR-4 inhibitory or stimulatory compounds, which may be termed as "candidate substances." It 
is contemplated that such screening techniques will prove useful in the general identification of 
any compound that will serve the purpose of inhibiting or stimulating TLR-4 directed signaling
of TNF secretion, and in preferred embodiments, will provide candidate compounds. 

It is further contemplated that useful compounds in this regard will in no way be limited 
to proteinaceous or peptidyl compounds. In fact, it may prove to be the case that the most useful
pharmacological compounds for identification through application of the screening assays will
be non-peptidyl in nature and, e.g., which will serve to inhibit TNF secretion through a tight
binding or other chemical interaction. Candidate substances may be obtained from libraries of
synthetic chemicals, or from natural samples, such as rain forest and marine samples.



Assay Formats

The present invention provides methods of screening for modulators of LPS mediated
response by monitoring the standard activity profile of TLR-4 in the presence and absence of the
candidate substance and comparing such results. It is contemplated that this screening technique
will prove useful in the general identification of a compound that will serve the purpose of
promoting, augmenting or increasing the activity of TLR-4 of a macrophage cell. Such
compounds will be useful in the treatment of various bacterial infections.

In these embodiments, the present invention is directed to a method for determining the
ability of a candidate substance to stimulate the wild-type TLR-4 of cells that either naturally
express TLR-4 or have been engineered to express TLR-4 as described herein. The method
includes generally the steps of:

(i) providing a cell expressing a TLR-4 polypeptide;
(ii) determining the activity of said TLR-4 polypeptide;
(iii) contacting said cell with a candidate substance; and
(iv) comparing the TLR-4 activity of the cell in step (iii) with the TLR-4
activity observed when said candidate substance is not added,

wherein an alteration in the activity indicates that said candidate substance is a modulator of said
TLR-4 activity.

To identify a candidate substance as being capable of stimulating TLR-4 in the assay
above, one would measure or determine the activity in the absence of the added candidate
substance. One would then add the candidate substance to the cell and determine the activity in
the presence of the candidate substance. A candidate substance which increases the activity or
capacity relative to activity observed in its absence is indicative of a candidate substance with
stimulatory capability.

In particular embodiments, any compound that stimulates the production of IFN or
related cytokines and mediates the inflammatory response to LPS or LPS containing moieties
(e.g., Gram negative bacteria) is contemplated. As stated above, a "candidate substance" refers
to any molecule that is capable of modulating the activity of TLR-4. The candidate substance
may be a protein or fragment thereof, a small molecule inhibitor, or even a nucleic acid molecule.
Accordingly, the active compounds may include fragments or parts of naturally-occurring
compounds or may be found as active combinations of known compounds which are otherwise
inactive.

Accordingly, the present invention provides screening assays to identify agents which
stimulate a cellular immune and/or inflammatory response. It is proposed that compounds
isolated from natural sources, such as animals, bacteria, fungi, plant sources, including leaves
and bark, and marine samples may be assayed as candidates for the presence of potentially
useful pharmaceutical agents. It will be understood that the pharmaceutical agents to be screened
could also be derived or synthesized from chemical compositions or man-made compounds.

Thus, it is understood that the candidate substance identified by the present invention
may be polypeptide, polynucleotide, small molecule inhibitors or any other compounds that may
be designed through rational drug design starting from known stimulators of immune and/or
inflammatory response. For example, Barrett et al. describe peptides and compounds that bind
the IL-1 receptor (U.S. Patent 5,786,331, specifically incorporated herein by reference) and U.S.
Patent 5,508,262 describes an IL-1 receptor agonist. It may be that, as TLR-4 is an analogue of
the IL-1 receptor, agonists of IL-1 may also be agonists of TLR-4. Alternatively, known
modulators of IL-1 receptors may prove to be a useful starting point in a rational drug design
strategy that will yield experimentally, therapeutically or clinically relevant compounds that
modulate the activity of TLR-4 and the immune response that TLR-4 mediates.

The candidate screening assays are simple to set up and perform. Thus, in assaying for a
candidate substance, after obtaining a cell expressing functional TLR-4, one will admix a
candidate substance with the cell, under conditions which would allow measurable TNF
secretion to occur. In this fashion, one can measure the ability of the candidate substance to
stimulate the TNF secretory response of the cell: one would measure the response in the absence
of the candidate substance, then measure the response in the presence of the candidate substance,
and determine the effect of the candidate substance by comparison.

"Effective amounts" in certain circumstances are those amounts effective to reproducibly
stimulate the TNF secretory (inflammatory) response from the cell in comparison to their normal
levels. Compounds that achieve significant appropriate changes in activity will be used.

Significant changes in inflammatory response, e.g., as measured by TNF production,
splenocyte activity and the like, are represented by an increase/decrease in the response of at least
about 30%-40%, and most preferably, by changes of at least about 50%, with higher values of
course being possible.
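
The comparison of activity with and without a candidate substance, together with the thresholds just stated, can be sketched as follows. This is a minimal Python illustration only: the function names, the use of 30% as the lower bound of "about 30%-40%", and the example readings are assumptions, not values or methods taken from the disclosure.

```python
def percent_change(baseline: float, treated: float) -> float:
    """Percent change of the treated response relative to the baseline response."""
    if baseline <= 0:
        raise ValueError("baseline response must be positive")
    return (treated - baseline) / baseline * 100.0


def classify_change(baseline: float, treated: float,
                    significant: float = 30.0, preferred: float = 50.0):
    """Return (direction, level) for a response measured without and with a candidate.

    The 30% and 50% cut-offs follow the ranges stated in the text; treating them
    as sharp thresholds is a simplification made for this sketch.
    """
    change = percent_change(baseline, treated)
    magnitude = abs(change)
    if magnitude >= preferred:
        level = "most preferred"
    elif magnitude >= significant:
        level = "significant"
    else:
        level = "not significant"
    if change > 0:
        direction = "increase"
    elif change < 0:
        direction = "decrease"
    else:
        direction = "no change"
    return direction, level


if __name__ == "__main__":
    # Hypothetical TNF readings (arbitrary units) without and with a candidate.
    print(classify_change(100.0, 160.0))
    print(classify_change(100.0, 65.0))
```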

It will, of course, be understood that all the screening methods of the present invention
are useful in themselves notwithstanding the fact that effective candidates may not be found. The
invention provides methods for screening for such candidates, not solely methods of finding
them.

Rational Drug Design

The goal of rational drug design is to produce structural analogs of biologically active
polypeptides or compounds with which they interact (agonists, antagonists, inhibitors, binding
partners, etc.). By creating such analogs, it is possible to fashion drugs which are more active or
stable than the natural molecules, which have different susceptibility to alteration or which may
affect the function of various other molecules. In one approach, one would generate a
three-dimensional structure for TLR-4 or a fragment thereof. This could be accomplished by x-ray
crystallography, computer modeling or by a combination of both approaches. An alternative
approach, "alanine scan," involves the random replacement of residues throughout the molecule
with alanine, and determination of the resulting effect on function.
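
The alanine-scan idea can be illustrated with a short Python sketch that enumerates single-residue alanine variants of a peptide sequence. Note the assumption: the sketch replaces each non-alanine position systematically, one at a time, which is a common simplification and not necessarily the "random replacement" referred to above. The function name `alanine_scan` and the toy sequence are hypothetical.

```python
def alanine_scan(sequence: str):
    """Return (variant name, variant sequence) pairs, one per non-alanine position.

    Variant names use the conventional form <original residue><position>A,
    with positions numbered from 1.
    """
    sequence = sequence.upper()
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # already alanine; nothing to replace
        name = f"{residue}{index + 1}A"
        variants.append((name, sequence[:index] + "A" + sequence[index + 1:]))
    return variants


if __name__ == "__main__":
    for name, variant in alanine_scan("MKAL"):  # hypothetical toy peptide
        print(name, variant)
```

Each variant would then be expressed and tested in a functional assay so that positions whose replacement alters function can be identified.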

It also is possible to isolate a specific antibody, selected by a functional assay, and then
solve its crystal structure. In principle, this approach yields a pharmacore upon which
subsequent drug design can be based. It is possible to bypass protein crystallography altogether
by generating anti-idiotypic antibodies to a functional, pharmacologically active antibody. As a
mirror image of a mirror image, the binding site of the anti-idiotype would be expected to be an
analog of the original antigen. The anti-idiotype could then be used to identify and isolate
peptides from banks of chemically- or biologically-produced peptides. Selected peptides would
then serve as the pharmacore. Anti-idiotypes may be generated using the methods described
herein for producing antibodies, using an antibody as the antigen.

Thus, one may design drugs which have activity as stimulators, inhibitors, agonists, 
antagonists of TLR-4 or molecules affected by TLR-4 function. Such rational drug design may 
start with lead compounds already known to those of skill in the art. By virtue of the availability 
of cloned TLR-4 sequences, sufficient amounts of these proteins can be produced to perform
crystallographic studies. In addition, knowledge of the polypeptide sequences permits computer 
employed predictions of structure-function relationships. Particularly useful agents that may be 
found by the present invention will be those agents that stimulate the TLR-4 receptor response to 
LPS and therefore increase the immune attack against LPS. Such agents may bind to the TLR-4. 

K. Formulations and Routes for Administration to Patients

Where clinical applications are contemplated, it will be necessary to prepare the 
expression vectors or candidate substances of the present invention as pharmaceutical 
compositions, i.e., in a form appropriate for in vivo applications. Generally, this will entail
preparing compositions that are essentially free of pyrogens, as well as other impurities that 
could be harmful to humans or animals. 

One will generally desire to employ appropriate salts and buffers to render delivery
vectors stable and allow for uptake by target cells. Buffers also will be employed when
recombinant cells are introduced into a patient. Aqueous compositions of the present invention
comprise an effective amount of the vector or cells, dissolved or dispersed in a pharmaceutically
acceptable carrier or aqueous medium. Such compositions also are referred to as inocula. The
phrase "pharmaceutically or pharmacologically acceptable" refers to molecular entities and
compositions that do not produce adverse, allergic, or other untoward reactions when
administered to an animal or a human. As used herein, "pharmaceutically acceptable carrier"
includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents,
isotonic and absorption delaying agents and the like. The use of such media and agents for
pharmaceutically active substances is well known in the art. Except insofar as any conventional
media or agent is incompatible with the vectors or cells of the present invention, its use in
therapeutic compositions is contemplated. Supplementary active ingredients also can be
incorporated into the compositions.

The active compositions of the present invention include classic pharmaceutical 
preparations. Administration of these compositions according to the present invention will be
via any common route so long as the target tissue is available via that route. This includes oral,
nasal, buccal, rectal, vaginal or topical. Alternatively, administration may be by orthotopic,
intradermal, subcutaneous, intramuscular, intraperitoneal or intravenous injection. Such
compositions would normally be administered as pharmaceutically acceptable compositions,
described supra. 

The active compounds may be administered via any suitable route, including parenterally
or by injection. Solutions of the active compounds as free base or pharmacologically acceptable
salts can be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose.
Dispersions also can be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof
and in oils. Under ordinary conditions of storage and use, these preparations contain a
preservative to prevent the growth of microorganisms.

The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or
dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions
or dispersions. In all cases the form must be sterile and must be fluid to the extent that easy
syringability exists. It must be stable under the conditions of manufacture and storage and must
be preserved against the contaminating action of microorganisms, such as bacteria and fungi.
The carrier can be a solvent or dispersion medium containing, for example, water, ethanol,
polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like),
suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for
example, by the use of a coating, such as lecithin, by the maintenance of the required particle
size in the case of dispersion and by the use of surfactants. The prevention of the action of
microorganisms can be brought about by various antibacterial and antifungal agents, for example,
parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be
preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged
absorption of the injectable compositions can be brought about by the use in the compositions of
agents delaying absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions are prepared by incorporating the active compounds in the 
required amount in the appropriate solvent with various of the other ingredients enumerated 
above, as required, followed by filtered sterilization. Generally, dispersions are prepared by 
incorporating the various sterilized active ingredients into a sterile vehicle which contains the
basic dispersion medium and the required other ingredients from those enumerated above. In the
case of sterile powders for the preparation of sterile injectable solutions, the preferred methods
of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the
active ingredient plus any additional desired ingredient from a previously sterile-filtered solution
thereof. 

As used herein, "pharmaceutically acceptable carrier" includes any and all solvents, 
dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying 
agents and the like. The use of such media and agents for pharmaceutical active substances is
well known in the art. Except insofar as any conventional media or agent is incompatible with 
the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary 
active ingredients also can be incorporated into the compositions. 

For oral administration the polypeptides of the present invention may be incorporated
with excipients and used in the form of non-ingestible mouthwashes and dentifrices. A
mouthwash may be prepared incorporating the active ingredient in the required amount in an
appropriate solvent, such as a sodium borate solution (Dobell's Solution). Alternatively, the
active ingredient may be incorporated into an antiseptic wash containing sodium borate, glycerin
and potassium bicarbonate. The active ingredient may also be dispersed in dentifrices,
including: gels, pastes, powders and slurries. The active ingredient may be added in a
therapeutically effective amount to a paste dentifrice that may include water, binders, abrasives,
flavoring agents, foaming agents, and humectants.

The compositions of the present invention may be formulated in a neutral or salt form.
Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino
groups of the protein) and which are formed with inorganic acids such as, for example,
hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and
the like. Salts formed with the free carboxyl groups also can be derived from inorganic bases
such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such
organic bases as isopropylamine, trimethylamine, histidine, procaine and the like.

Upon formulation, solutions will be administered in a manner compatible with the
dosage formulation and in such amount as is therapeutically effective. The formulations are
easily administered in a variety of dosage forms such as injectable solutions, drug release
capsules and the like. For parenteral administration in an aqueous solution, for example, the
solution should be suitably buffered if necessary and the liquid diluent first rendered isotonic
with sufficient saline or glucose. These particular aqueous solutions are especially suitable for
intravenous, intramuscular, subcutaneous and intraperitoneal administration.

In this connection, sterile aqueous media which can be employed will be known to those
of skill in the art in light of the present disclosure. For example, a unit dose could be dissolved
in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or
injected at the proposed site of infusion (see, for example, "Remington's Pharmaceutical
Sciences" 15th Edition, pages 1035-1038 and 1570-1580). Some variation in dosage will
necessarily occur depending on the condition of the subject being treated. The person
responsible for administration will, in any event, determine the appropriate dose for the
individual subject. Moreover, for human administration, preparations should meet sterility,
pyrogenicity, general safety and purity standards as required by FDA Office of Biologics
standards.

L. Examples 

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed in the
examples which follow represent techniques discovered by the inventor to function well in the
practice of the invention, and thus can be considered to constitute preferred modes for its
practice. However, those of skill in the art should, in light of the present disclosure, appreciate
that many changes can be made in the specific embodiments which are disclosed and still obtain
a like or similar result without departing from the spirit and scope of the invention. 

EXAMPLE 1 

Materials And Methods

Mice

[SWR x C3H/HeJ] x C3H/HeJ F2 animals. SWR mice were obtained from Jackson
Laboratories, and maintained in the Animal Resource Center of the University of TX,
Southwestern Medical Center. C3H/HeJ mice were also obtained from Jackson Laboratories
(Bar Harbor, ME). F1 mice obtained by the cross of C3H/HeJ males to SWR females were
backcrossed at eight weeks of age to C3H/HeJ animals of both sexes. 493 F2 animals were used
at six to eight weeks of age for analysis of LPS response phenotype and preparation of genomic
DNA. In this panel of animals, both splenocyte responses and macrophage TNF production
were assayed as endpoints of endotoxin response determination.

[C57BL/6 x C3H/HeJ] x C3H/HeJ F2 animals. 1600 F2 mice were produced at the
Jackson Laboratories, and shipped to the Animal Resource Center of the University of TX,
Southwestern Medical Center at 5 to 8 weeks of age. Animals were immediately ear-tagged for
identification and tail cuttings were taken for identification. For this panel of animals, selection
of individuals with recombination between markers D4Mm64 and D4MIT27 was made by
PCR analysis. All but a few non-recombinants were sacrificed; the remaining non-recombinants
were used as controls (obligate Lps^ heterozygotes or homozygotes with representative genetic
variability at other loci) with each assay series. In this panel of animals, macrophage TNF
production was used as the sole endpoint of endotoxin response determination.

Preparation of genomic DNA

Tail fragments approximately 5 mm in length were placed in 1.5 ml microcentrifuge
tubes containing 650 µl of SSTE buffer, in which proteinase K had been dissolved at a
concentration of 1 mg/ml. SSTE buffer was made by bringing 10 ml of 10% SDS, 2 ml of 5 M
NaCl, 5 ml of 1 M Tris, pH 8.0, and 3 ml of 0.5 M EDTA, pH 8.0 to a final volume of 100 ml
with distilled water. The tissue was digested overnight at 55 °C. 30 µl of RNase T1 (5
was then added to each tube, and a second incubation was carried out at 37 °C for one hour.
DNA was then extracted with phenol, 1:1 phenol/chloroform solution, and finally chloroform.
After addition of 1/10 volume of 3 M sodium acetate solution, samples were precipitated with
two volumes of ethanol. Pellets were dried and redissolved in 300 µl of TE solution.
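
As a worked check of the SSTE recipe given above, the final concentration of each component follows from simple dilution (stock concentration x stock volume / final volume). The Python sketch below performs that arithmetic; the function name and the dictionary layout are illustrative assumptions, and the stock values are those stated in the recipe (with the Tris stock read as 1 M).

```python
def final_concentration(stock: float, stock_volume_ml: float,
                        final_volume_ml: float) -> float:
    """Concentration after diluting `stock_volume_ml` of stock to `final_volume_ml`."""
    return stock * stock_volume_ml / final_volume_ml


# (stock concentration, unit, volume of stock in ml) for each SSTE component.
SSTE_RECIPE = {
    "SDS": (10.0, "%", 10.0),
    "NaCl": (5.0, "M", 2.0),
    "Tris, pH 8.0": (1.0, "M", 5.0),
    "EDTA, pH 8.0": (0.5, "M", 3.0),
}
FINAL_VOLUME_ML = 100.0

if __name__ == "__main__":
    for name, (stock, unit, volume) in SSTE_RECIPE.items():
        conc = final_concentration(stock, volume, FINAL_VOLUME_ML)
        print(f"{name}: {conc:g} {unit}")
```

On this arithmetic the working buffer is 1% SDS, 0.1 M NaCl, 0.05 M Tris and 0.015 M EDTA.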

Assays for LPS responsiveness 

Splenocyte proliferation assays were used to type all 493 animals produced by the
backcross of C3H/HeJ mice to SWR mice, and were performed by harvesting 30 to 40 million
spleen cells from each animal. Erythrocytes were lysed with ammonium chloride, and the
remaining cell population, predominantly lymphocytes, was plated at a density of 1 million
cells per well, in triplicate, in round bottom 96-well plates. Cells were stimulated with Con A as
a control for viability, and also with LPS at two concentrations (1.0 µg/ml and 5.0 µg/ml) for a
period of 72 hours, in the presence of tritiated thymidine. All samples showing Con A
responses that exceeded 100,000 CPM were retained for analysis; those occasional samples that
had smaller Con A responses were discarded from consideration. A cell harvester was used to
collect the lymphocytes, and measurements of thymidine incorporation were made. An index of
LPS responsiveness (I) was calculated for each sample as follows:

I = [mean CPM of LPS-induced splenocytes]/[mean CPM of non-induced splenocytes]

For simplicity, only the data pertaining to stimulation with 5 µg/ml LPS are reported in
this paper.
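
The index calculation and the Con A viability filter can be sketched in Python as follows. This is a minimal illustration: applying the 100,000 CPM cut-off to the mean of the triplicate Con A wells is an assumption, as are the function name `lps_index` and the example counts.

```python
from statistics import mean

CON_A_MIN_CPM = 100_000  # Con A responses must exceed this value to be retained


def lps_index(lps_cpm, unstimulated_cpm, con_a_cpm):
    """Return I = mean LPS-induced CPM / mean non-induced CPM, or None if discarded.

    Each argument is a sequence of replicate counts (triplicates in the text).
    A sample is discarded when its mean Con A response does not exceed the cut-off.
    """
    if mean(con_a_cpm) <= CON_A_MIN_CPM:
        return None
    return mean(lps_cpm) / mean(unstimulated_cpm)


if __name__ == "__main__":
    # Hypothetical triplicate counts for a single animal.
    print(lps_index([30000, 33000, 27000], [3000, 2900, 3100],
                    [150000, 140000, 160000]))
```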

TNF production by peritoneal macrophages was used to type all 1600 animals
produced by the backcross of C3H/HeJ mice to C57BL/6 mice. TNF mediated cytolytic activity
is released by LPS-responsive cells following endotoxin stimulation in vitro, and was used to
quantitate LPS sensitivity. Eight-week-old mice were injected intraperitoneally with sterile
Brewer's thioglycollate broth (3.0 ml). 4 days later, peritoneal exudate macrophages were
harvested by peritoneal lavage. 2x10^ cells from each mouse were plated in 24 well plates, and
after adherence, stimulated by LPS at concentrations of 0.01 and 0.5 µg/ml. After 16 hours, the
medium was harvested and assayed for TNF content. TNF assays were performed using the
L929 cell method, in which cells were plated in 96 well plates, exposed to the macrophage
medium, and after rinsing, stained with crystal violet. The intensity of staining, measured in a
microplate reader at 590 nm, corresponded to the fraction of surviving cells (Cseh and Beutler,
1989). Results were expressed as % cytotoxicity, defined as:

[1 - OD[sample well]/OD[control]] x 100
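
The % cytotoxicity formula above translates directly into code. The following Python sketch is illustrative only; the function name and the example optical density readings are hypothetical.

```python
def percent_cytotoxicity(od_sample: float, od_control: float) -> float:
    """Compute [1 - OD(sample well) / OD(control)] x 100 from 590 nm readings."""
    if od_control <= 0:
        raise ValueError("control optical density must be positive")
    return (1.0 - od_sample / od_control) * 100.0


if __name__ == "__main__":
    # Hypothetical readings: a sample well retaining a quarter of the control staining.
    print(percent_cytotoxicity(0.25, 1.0))
```

A sample well staining as intensely as the control gives 0% cytotoxicity, and a well with no residual staining gives 100%.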

Both the splenocyte proliferation assay and the macrophage response assay permitted 
unambiguous discrimination between responders (heterozygotes for the Lps^ allele) and 
nonresponders (homozygotes for the Lps^ allele), but the macrophage response assay showed 
less day-to-day variability and was more convenient to perform.

Contig Assembly. 

YACs were isolated from the Research Genetics mouse YAC library by PCR-based
screening of a matrix consisting of superpools, pools, and plates. The probes initially employed
in this process were the map-pair markers: D4MIT82, D4MIT218, D4MIT325, D4MIT244,
D4MIT25, NdS9, D4Mm78, D4MIT7, D4Mm32, and D4Mrr83. A total of 20 YACs were
isolated in this manner. All were sized by pulsed-field gel electrophoresis. The ends of all of
the YACs were checked by sequencing. In some cases, continuity with adjoining BACs was
immediately established. When highly repetitive sequence was present, internal sequences were
determined by cloning rather large fragments of the YAC that contained the end. When no
overlap was established, additional YACs were isolated using the end sequence. In several
instances, after two such cycles of chromosome walking, it was concluded that the original
YACs were chimeric.

Unique sequences derived from shotgun cloning of YACs, as well as the above
mentioned D4MIT markers, were used to identify BAC clones, chiefly by screening the
Research Genetics mouse BAC library. Two additional clones (BACs 84G8 and 389F15) were
obtained by screening BAC libraries maintained at Genome Systems (St. Louis, MO). A
contiguous span of BACs crossing most of the interval was produced, as in the case of the YAC
contig, by chromosome walking. Among the BACs that were isolated, only two proved to be
chimeric; the chimeric BACs are not presented in any of the Figures.



Novel marker identification.

YACs were fragmented by ultrasonic shearing and subcloned for internal sequence
analysis. Both random, non-repetitive sequences and novel microsatellite repeats were isolated
in this manner. More than 100 such markers were identified across the region in question.
These markers were used for PCR-based screening of a mouse BAC library from Research
Genetics (Birmingham, AL), and permitted the isolation of BAC clones. The PCR primers
defining all markers relevant to the Figures are presented in Table I.

Shotgun sequencing.

Using a minimal contig containing 24 BACs and one YAC, which together encompassed
>95% of the interval between markers B and 83.3, approximately 20 million bases of
high-quality sequence were obtained, from bidirectional reads of approximately 20,000 fragments of
DNA (i.e., approximately 40,000 reads in all). In this process, all BACs and the YAC clone
100E4 were fragmented by ultrasonic shearing, repaired to a blunt-ended state using Klenow
fragment or mung bean nuclease, and subcloned into the vector pBluescript, which had been cut
with SmaI and treated with calf intestine alkaline phosphatase to prevent self ligation. The
average fragment size was about 1.5 kb. Automated sequencing was performed using ABI model
377A sequencers, and chain termination chemistry was used by all four of these sequencing
laboratories.

Bioinformatic analysis. 

All sequences were hand-edited or processed by Phred and Phrap (obtained from Brent
Ewing and Phil Green, respectively; University of Washington Genome Center) to remove
vector, and to assemble as completely as possible. Individual reads were stored as a growing
database in a single large directory, and subjected to the following tests, all of which were carried
out using Genetics Computer Group (GCG) software, or via web or e-mail servers:

1. All sequences were periodically compared to all other sequences in the database using
the Fasta search algorithm, in order to establish and display regions of overlap and homology.

2. All sequences were masked to hide common repetitive elements from consideration
prior to searches against the large sequence databases listed below. Censoring was usually
carried out using the program RepeatMasker (obtained from Arian Smit, University of
Washington).
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3. All sequences, in censored and uncensored form, were subjected to blastn analysis 
using the genembl, dbest, and HCD databases, the latter at levels I and II. 

4. All sequences, in censored (masked) and uncensored form, were subjected to blastx 
5 analysis using the genembl and HCD databases, the latter at levels I and H. 

5. In some select instances, sequences were studied locally using the framesearch 
algorithm to detect open reading frames with homology to components of the Swissprot 
database. 

10 

6. All sequences were analyzed for rare peptide motifs by translating them in all six 
frames and subjecting them to a local Motif search. 

7. All sequences were, at various stages, aligned with others using the GCG program 
15 Gelmerge, or using the program Phrap. In this manner, long contiguous sequences were obtained 

and unambiguous overlap between adjacent BACs, as well as the approximate extent of overlap, 
could be inferred. 

8. Individual sequencing reads, or when possible, contigs of reads, were subjected to 
20 analysis by GRAIL 2 in order to identify putative exons, 
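
Step 6 above depends on translating each sequence in all six reading frames. A minimal Python sketch of that operation, using the standard genetic code, is given here for illustration. It is not the GCG or Motif software actually used, and the function names and example sequence are hypothetical.

```python
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq: str) -> dict:
    """Translate frames +1..+3 (given strand) and -1..-3 (reverse complement)."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for name, peptide in six_frame_translations("ATGCCACATTAA").items():
        print(name, peptide)
```

Each of the six peptide strings could then be passed to a motif or protein-homology search; stop codons are reported as `*` so that open reading frames remain visible.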

Exon trapping. 

A total of 169 exons were trapped from the "zero area" of the critical region. Exon
trapping was performed using the vector pSPL3 (Burn et al., 1995). Digested genomic DNA was
ligated into the vector, and pooled clones (generally 20 to 100 at one time) were transfected into
COS-7 cells, obtained from the ATCC. Capture of putative exons was accomplished by PCR.

Hybridization selection of expressed cDNAs using BACs.

This method was carried out as described by Rommens et al. (1998), using cDNA from
RAW 264.7 cells (obtained from the ATCC). A total of 538 selected clones from the "zero area"
were examined in the course of the present study.

Fluorescence in situ hybridization.

Six separate distance measurements were made at Genome Systems, Inc., using
interphase nuclei of mouse cells. The measurements included distances between the following
BACs, which were used as probes:
[L22 and O12], [O12 and N13], [N13 and B6], [B6 and B22], [B22 and K4], [B22 and L15].

The minimum distance that may be measured through this approach is 0.2 Mb, and the
maximum distance that may be measured is 1.6 Mb. A 10% margin of error is certified for all
measurements.
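
The stated limits can be expressed as a simple check. In the sketch below, the function name and the return format are illustrative assumptions; only the 0.2 Mb and 1.6 Mb limits and the 10% margin come from the text.

```python
MIN_MB = 0.2    # minimum measurable distance
MAX_MB = 1.6    # maximum measurable distance
MARGIN = 0.10   # certified margin of error


def fish_interval(distance_mb: float):
    """Return (low, high) bounds in Mb, or None if outside the measurable range."""
    if not MIN_MB <= distance_mb <= MAX_MB:
        return None
    return (distance_mb * (1 - MARGIN), distance_mb * (1 + MARGIN))


if __name__ == "__main__":
    print(fish_interval(1.0))   # a distance inside the measurable window
    print(fish_interval(2.0))   # too large to be measured by this approach
```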

Genetic computation.

A 500 MHz DEC Alpha computer (DCG Viper EV-56) equipped with 63 GB of hard drive
space and 256 MB of RAM was used in all of these studies. Programs were run under a Digital
UNIX operating system. For the design of primer pairs to be used in PCR and sequence
extension, the program Gene Runner was used (Hastings Software, Inc.). The graphics program
used in depiction of the contig was Hijaak Draw (Inset Graphics, Inc.). Each of the latter two
programs was run under Windows 95 using a Pentium Pro computer from Gateway, Inc.

EXAMPLE 2

LPS Response Assays and Validation Thereof

Both splenocyte proliferation assays and assays of TNF production were used to
distinguish LPS nonresponder mice (Lps^d homozygotes) from LPS responders (Lps^d
heterozygotes) in analysis of the [SWR x C3H/HeJ]F1 x C3H/HeJ panel. Both assays were carried
out using two separate LPS concentrations. When splenocyte proliferation was used as an
endpoint, a simultaneous assay of proliferation occurring in response to concanavalin A was used
as a means of controlling for cell viability. In the [SWR x C3H/HeJ]F1 x C3H/HeJ panel,
conflict between the two assays was rarely observed, and mice judged to be ambiguous in
their responsiveness to LPS were discarded from further consideration.

Experience with the two assays of LPS response showed that measurement of TNF
production occurring at low (10 ng/ml) concentrations of LPS was a more reliable index of
responsiveness than measurement of splenocyte proliferation. Therefore, this assay was used
exclusively in analysis of the [C57BL/6 x C3H/HeJ]F1 x C3H/HeJ backcross panel. Since not
all of the potential recombinants were assayed for LPS responsiveness on the same day, and since
day-to-day variability in TNF measurement might confuse interpretation of the results, a series of
obligate Lps^d allele homozygotes and heterozygotes were identified by analysis of flanking
markers, and included as controls with each assay performed. The distinction between
responder and non-responder control groups was in every instance clear and unambiguous.
Similarly, the designation of each individual recombinant as homozygous or heterozygous with
respect to the Lps locus was accomplished without ambiguity.

EXAMPLE 3

[SWR x C3H/HeJ] x C3H/HeJ Backcross

In initial efforts to confine the Lps^d mutation to a point between two markers, animals of
the [SWR x C3H/HeJ] x C3H/HeJ backcross were examined. Raw data are not presented, as this
was essentially a range-finding study. A total of eleven D4MIT markers flanking the gene could
be used for genotypic analysis, as the two parental genotypes could be distinguished on the basis
of length differences. The crossover frequency between each of these markers and the Lps gene
in a panel of 493 mice is illustrated in Figure 1. A cluster of four markers (D4MIT82, 325, 218,
and 244), which underwent no recombination with one another, each exhibited a single crossover
event with the Lps mutation. This cluster of markers lay centromeric to the mutation. On the
distal side of the mutation, D4MIT80 was separated from Lps^d by 14 recombination events.
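
The crossover counts above translate directly into observed recombination percentages. The sketch below is an illustration under two stated assumptions: each backcross animal is treated as one informative meiosis, and over such short distances percent recombination is taken as an approximation of map distance in centiMorgans. The function and variable names are hypothetical.

```python
def percent_recombination(crossovers: int, meioses: int) -> float:
    """Observed recombinants per 100 meioses (approximately cM for small values)."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    return 100.0 * crossovers / meioses


PANEL_SIZE = 493  # mice in the [SWR x C3H/HeJ] x C3H/HeJ panel

# Centromeric marker cluster: a single crossover with the mutation.
proximal_cluster = percent_recombination(1, PANEL_SIZE)
# Distal marker D4MIT80: 14 recombination events.
distal_marker = percent_recombination(14, PANEL_SIZE)

print(f"proximal cluster: {proximal_cluster:.2f} per 100 meioses")
print(f"D4MIT80: {distal_marker:.2f} per 100 meioses")
```

On these counts the mutation lies far closer to the centromeric cluster than to D4MIT80, which is the asymmetry the mapping argument relies on.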

From the fact that only a single crossover event was observed between D4MIT82 and
Lps^d, it was inferred that the mutation might well lie within the proximal half of the D4MIT82 ->
D4MIT80 interval. Partial ordering of microsatellite markers between D4MIT82 and D4MIT80
was achieved by Bell et al., who employed a series of deletion mutants to map markers in the
region of the b locus (Bell et al., 1995). A YAC clone containing D4MIT83 (which was mapped
to the middle portion of the interval) was isolated, fragmented, and used to identify fresh
microsatellite repeat polymorphisms. One of these, designated CA83.3, lay distal to the mutation
and was separated from it by only six crossover events. Moreover, a total of six polymorphic
microsatellites (designated CA25.5, CA25.15, CA9.2, CA7.1, CA7.3, and CA7.11) were
identified in three other YAC clones, which were isolated using the markers D4MIT25, NdS9,
and D4MIT7. These six new markers all co-localized with the mutation (Figure 2). When
physical mapping (detailed below) revealed that some of the new markers were separated from
others of the group by as much as 1.6 Mb of DNA, it appeared certain that a second backcross
would be required to narrow the interval.

EXAMPLE 4

[C57BL/6 x C3H/HeJ] x C3H/HeJ Cross

A total of 1600 F2 mice made by crossing C57BL/6 x C3H/HeJ and backcrossing to
C3H/HeJ were analyzed for recombination in the region of the Lps gene. Those mice that
showed such recombination were further analyzed for LPS responsiveness, and fine mapping of
the recombination events was accomplished using novel microsatellite markers. The inventors
examined markers over a relatively broad range in a screen for crossover events. This range
spanned the interval D4MIT164 to D4MIT27, and included D4MIT178.

A total of 192 crossovers were observed between D4MIT164 and D4MIT27 in the 1600
meioses studied. Of these, 157 crossovers occurred between D4MIT178 and D4MIT27, while 35
crossovers occurred between D4MIT164 and D4MIT178. The location of these events could be
further assigned based on the identification of six novel polymorphisms within the D4MIT164 ->
D4MIT27 interval (Figure 2). Five novel markers (designated A, D, E, 7.3 and 7.11) still
co-localized with the Lps^d mutation, as did marker D4MIT178. The physical separation between all
four of these microsatellite markers (which were isolated from two overlapping BAC clones)
was approximately 1.2 Mb, corresponding to the "zero region" of the critical area. Four
crossovers were observed between Lps^d and marker B. In terms of physical size, the B -> 83.3
interval corresponds to approximately 2.6 Mb. Because the crossovers between B and 83.3 are
essential to confinement of Lps^d, the primary assay data for each of the seven critical meioses are
presented in Figure 3.

EXAMPLE 5

The physical map

A map of the entire interval between D4MIT325 and marker 83.3 is shown in Figure 4.
A total of 63 BAC clones and 20 YAC clones (only one of which is shown, in red) were isolated
to span the critical region. Each YAC and BAC clone is drawn to scale according to analysis by
pulsed-field gel electrophoresis. A FISH map is drawn above the assembly, and is in
approximate agreement with the electrophoretic map. The black bar in Figure 4 indicates the
region of zero recombination determined on the basis of 493 meioses analyzed by the cross
[SWR x C3H/HeJ] x C3H/HeJ and 1600 meioses analyzed by the cross [C57BL/6 x C3H/HeJ] x
C3H/HeJ. This "zero area" is approximately 1.2 Mb in length. The critical region, known with
certainty to contain the Lps^d mutation, extends over a distance of approximately 2.6 Mb,
circumscribed by markers "B" and D4MIT83.3.



EXAMPLE 6

Gene identification

A total of 169 putative exons were identified throughout the "zero area" of the contig, and in
selected portions that lay beyond the zero area, by exon trapping. A total of 538 clones were
isolated from the zero area and beyond by hybridization/selection. Some of each of these represented
fragments of known genes and pseudogenes, including Pappa, HMG-I and KIAA0029 (detected
by both exon trapping and hybridization selection). TCP-1 and BTG-1 were detected by
hybridization selection, though not by exon trapping. Most of the sequences identified by the
two methods could not be shown to encode authentic, expressed genes despite exhaustive
analysis through northern blotting and other techniques.

High density sequence analysis was carried out over an interval containing 24 minimally
overlapping BACs and one YAC clone. These BACs are depicted in Figure 5. BACs shown in
blue were sequenced either to completion or to the point of finishing (i.e., only a few gaps remain
between large assemblies of sequence). BACs shown in red were sequenced to a high
density (several hundred reads each) but not to the point of finishing, and are still considered "in
progress." YAC 100E4 was analyzed with nearly 1000 reads. The genes and pseudogenes
detected within the region by homology searches are indicated as well.

A large number of exons were predicted by GRAIL analysis. However, many of these
were of retroviral origin (e.g., exons within L1, B1, Bam5, and IAP repeats). Among those that
were not, some belonged to known genes such as Pappa. Among all GRAIL-predicted,
non-repetitive exons with ratings of "good" to "excellent," none could be demonstrated to encode
authentic macrophage-expressed transcripts when used as probes on northern blots.

Pseudogenes encoding fragments of TCP-1, BTG-1, NKR-P12, KIAA0029, HMG-1, and
cytochrome P450 were identified within the contig. However, all of these genes were either
fragmentary or contained nonsense mutations. Only two authentic genes were identified within
the contig. These were the classical marker gene Pappa, encoding a plasma metalloproteinase,
and the mouse homolog of TLR-4, encoding a transmembrane protein homologous to the IL-1
receptor. The TLR-4 gene resides in BAC clones 309I17 and 152C16. Its sequence (5' -> 3')
corresponds to a proximal -> distal orientation along the chromosome. The complete size and
intron-exon boundaries of the gene remain to be determined.

EXAMPLE 7

Identification of the Lps^d Mutation that Alters the TLR-4 Protein Cytoplasmic Domain

Mutations of the Lps gene selectively impede lipopolysaccharide (LPS) signal
transduction in C3H/HeJ and C57BL/10ScCR mice. Homozygotes display a phenotype
characterized by resistance to all biological effects of endotoxin and exaggerated susceptibility to
overwhelming gram-negative infection. The codominant Lps^d mutation of C3H/HeJ mice was
confined to a 0.9 centiMorgan genetic interval, based on the analysis of 2093 meioses.

A minimal contig, consisting of 20 BAC clones and one YAC clone, was subjected to
sequence analysis. Approximately 40,000 sequencing reads were obtained from shotgun-cloned
genomic DNA, bringing over 1.6 Mb of the central contig to a near-contiguous state and yielding
rather dense coverage of >95% of the entire critical region. BLAST searches performed on
masked versions of the sequence disclosed dozens of high-scoring homologies with published
expressed sequence tags (ESTs), but these were excluded from consideration as they could not be
cloned from macrophage or fetal cDNA libraries of trusted complexity. Several pseudogenes
were observed, but dismissed because they were found to be fragmentary. GRAIL analyses,
performed on long contiguous sequences of the central contig using X-GRAIL software, revealed
an abundance of retroviral repeats and scattered non-retroviral exons, most of which proved to be
derived from pseudogenes.

Two authentic genes (a portion of the Pappa locus and the entire TLR-4 receptor locus)
were identified in the entire region, by both BLAST and GRAIL analysis. The orientation and
location of these genes, with respect to the nearest genetic markers, are presented in Figure 6.
Both genes lie at the centromeric end of the critical region, and at most, only a small 5' fragment
of Pappa lies distal to marker B.

Pappa encodes a secreted metalloproteinase, which lacks any evidence of a
transmembrane domain. It is expressed by placental tissue, but not by primary macrophages or
macrophage cell lines (not shown), and for these reasons, as well as its extreme proximity to
marker B (which is separated from Lps^d by four crossovers in a panel of 1600 meioses), it is
considered a poor candidate.

In contrast, the TLR-4 locus seemed a particularly attractive candidate, both on the
grounds of map position, and because the pro-inflammatory IL-1 receptor is also a member of the
toll receptor family. Further, a human mutation causing co-resistance to LPS and IL-1 attests to
the likelihood that the IL-1 and LPS signal transduction proteins share a common downstream
intermediate. A priori, it would therefore seem likely that the IL-1 and LPS transducers are
structurally related. Finally, the pro-inflammatory signaling potential of TLR-4 is suggested by
studies in which chimeric versions of human TLR-4, bearing a CD4 sequence in place of the
native extracellular domain, were shown to be capable of activating NF-κB in human
mononuclear cells.

Accordingly, the TLR-4 cDNAs from C3H/HeJ mRNA, and from the mRNA of several
LPS-responsive strains of mice (including C3H/HeN), were cloned by reverse transcription and
polymerase chain reaction, using primers derived from the genomic sequence. The amplified
product was fragmented by sonication, shotgun cloned into the vector pBluescript, and sequenced
using an Applied Biosystems model 373 DNA sequencer. One hundred reads were aligned using
the programs phred and phrap (obtained from Brent Ewing and Phil Green, University of
Washington Genome Center).

A single mutation was observed in the 835 aa coding region of the TLR-4 cDNA derived
from C3H/HeJ mice. At position 712 (within the cytoplasmic domain), a histidine is predicted
rather than the proline that resides in mouse TLR-4 from C3H/HeN mice, SWR mice, or
C57BL/6 mice. Furthermore, the residue is invariant across species, in that endotoxin-responsive
mice, rats and humans all display a proline in the relevant position (Figure 7a). The same
mutation was identified in DNA amplified from the C3H/HeJ genome and directly sequenced. It
was not observed in DNA amplified from the C3H/HeN genome, nor in DNA from SWR,
C57BL/6, 129, or DBA-2 mice (Figure 7b). No other difference in sequence has ever been
observed on comparison of any genomic interval from C3H/HeN and C3H/HeJ mice, which
were, until recently, a single strain. In Figure 7a, the sequence identified as J-toll is the mutant
protein sequence in which the Pro at residue 712 is mutated to His.
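
A proline-to-histidine substitution can arise from a single nucleotide change. The sketch below is illustrative only: it does not reproduce the actual Tlr4 codon, which is not stated in this passage, but enumerates the single-base changes that convert a proline codon into a histidine codon under the standard genetic code. The function name is hypothetical.

```python
PRO_CODONS = {"CCT", "CCC", "CCA", "CCG"}   # proline, standard genetic code
HIS_CODONS = {"CAT", "CAC"}                 # histidine, standard genetic code


def single_base_changes(src_codons, dst_codons):
    """List (source, target, codon position, change) for codon pairs differing at one base."""
    changes = []
    for src in sorted(src_codons):
        for dst in sorted(dst_codons):
            diffs = [i for i in range(3) if src[i] != dst[i]]
            if len(diffs) == 1:
                pos = diffs[0]
                changes.append((src, dst, pos + 1, f"{src[pos]}>{dst[pos]}"))
    return changes


if __name__ == "__main__":
    for src, dst, pos, change in single_base_changes(PRO_CODONS, HIS_CODONS):
        print(f"{src} -> {dst}: codon position {pos}, {change}")
```

In every case the change is a C-to-A transversion at the second codon position, so a single point mutation is sufficient to account for the predicted substitution.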

While the TLR-4 cDNA was readily amplified by RT-PCR from macrophage RNA
derived from C3H/HeJ, C3H/HeN, and C57BL/10ScN mice, it could not be amplified from
macrophage RNA derived from C57BL/10ScCR mice. On the other hand, several
low-abundance control cDNAs could be amplified from all strains without difficulty (Figure 8a).
Moreover, though the toll mRNA is relatively scarce, it could be detected on Northern blots
prepared using total RNA derived from macrophages of the three former strains; it could not be
detected in total RNA obtained from C57BL/10ScCR mice (Figure 8b). Thus, a cis-acting
mutation (yet to be defined at the genomic level) prevents the expression of processed toll
mRNA in C57BL/10ScCR mice.

Since a definable mutation exists within TLR-4 in C3H/HeJ mice, and a severe deficiency
or complete absence of TLR-4 mRNA expression is observed in C57BL/10ScCR mice, it is clear
that Lps is, in fact, the TLR-4 gene.

The Lps^d mutation of C3H/HeJ mice was originally described as codominant, in the sense
that Lps^n/Lps^d heterozygotes show intermediate levels of endotoxin response. The point
mutation (Pro -> His) that we have identified exerts a dominant negative effect on LPS signal
transduction. Supporting this conclusion, the functionally null (i.e., non-expressed) Lps allele
represented in C57BL/10ScCR mice is strictly recessive. Since the TLR-4 molecule functions in
a dimeric state, it is likely that the Lps^d mutation renders interplay between normal and abnormal
subunits unproductive; hence the codominant phenotype. Alternatively, the mutation may cause
sequestration of a downstream signaling molecule.
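
To make the dimer argument concrete, the sketch below computes the expected proportions of dimer types in a heterozygote. It assumes, purely for illustration, that normal and mutant subunits are produced in equal amounts and pair at random; neither assumption is established in the text, and the function name is hypothetical.

```python
from fractions import Fraction


def dimer_fractions(mutant_share: Fraction) -> dict:
    """Expected dimer composition under random pairing of subunits."""
    n = 1 - mutant_share   # share of normal subunits
    m = mutant_share       # share of mutant subunits
    return {
        "normal/normal": n * n,
        "normal/mutant": 2 * n * m,
        "mutant/mutant": m * m,
    }


# Heterozygote with equal expression of the two alleles (assumption).
heterozygote = dimer_fractions(Fraction(1, 2))

if __name__ == "__main__":
    for kind, share in heterozygote.items():
        print(f"{kind}: {share}")
```

Under these assumptions only one quarter of dimers would consist of two normal subunits. If mixed dimers are unproductive, the heterozygote would therefore signal at a reduced but non-zero level, which is consistent with an intermediate phenotype.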

Insofar as the cross:

C3H/HeJ x C57BL/10ScCR

yields mice that are profoundly unresponsive to LPS, it may also be inferred that a single copy of
the Lps^d allele (i.e., the genotype Lps^d/-) corresponds to a phenotype as unresponsive as that
conferred by Lps^d/Lps^d. This observation supports the notion that signal transduction proceeds
directly through the TLR-4 molecule, and tends to detract from the hypothesis that TLR-4
undergoes interaction with a second plasma membrane protein that acts, in turn, as a primary
LPS transducer.

TLR-4 mRNA is expressed predominantly in lymphoid tissues, and the TLR-4 mRNA is
induced by LPS (Figure 9), suggesting that up-regulation of TLR-4 mRNA may perpetuate cell
activation by LPS. Moreover, the well-known phenomenon of endotoxin tolerance is not likely
to result from down-regulation of TLR-4.

In Drosophila, several Toll homologs are now known to exist (including 18-wheeler).
Toll itself engages an extracellular protein ligand, spätzle, which promotes signal transduction
via tub and pelle gene products, leading to dissociation of the cytoplasmic inhibitor encoded by
cactus from a yet-unidentified member of the rel family of transcription factors. This pathway
culminates in activation of the drosomycin gene, and is required for effective protection against
fungal infection.

The mammalian ligand for TLR-4 has not yet been identified, and while an endogenous
ligand may exist, it is equally possible that LPS recognition is the raison d'etre for TLR-4.
CD14, the best characterized cell surface receptor for LPS, is also a member of the toll
superfamily, and it is likely that it directly engages TLR-4 upon interaction with LPS, inducing
signal transduction through the latter protein.




TLR-4 signal transduction is believed to proceed through activation of MyD88, a
cytoplasmic protein bearing homology both to the cytoplasmic domain of toll itself and to a pair
of death domain motifs, originally described in the TNF receptor family. In addition, the
interleukin-1 receptor associated kinase (IRAK) and TNF-receptor associated factor (TRAF-6)
lie on the TLR-4 signaling pathway, leading to the activation of AP-1 and NF-κB. It has been
known for some time that NF-κB activation is essential for transcriptional activation of the
mouse TNF gene, and that TNF subserves many of the effects of LPS, including the lethal effect.

It has recently been reported that human toll-2 cDNA, transfected into 293 cells, can
promote LPS signal transduction, given co-expression of CD14. It was also noted that LPS is
directly bound by soluble dimerized versions of toll-2, and that toll-2 mutants bearing truncations
in the cytoplasmic domain exert a dominant inhibitory effect on LPS signaling. LPS signal
transduction via TLR-4 has never been reported. However, the demonstration that Lps is
identical to TLR-4 leaves no room for doubt that TLR-4 is essential for LPS signaling. In mice
lacking functional TLR-4, toll-2 does not make a substantial contribution to LPS signal
transduction; hence C3H/HeJ and C57BL/10ScCR mice are entirely refractory to LPS. Though it
is possible that toll-2, or other members of the toll family, might also be required for LPS
signaling, the data in hand do not sustain this conclusion. Recently, it was demonstrated that
Chinese hamsters lacking a functional copy of a toll-2 gene were still responsive to endotoxin
(Heine et al., 1999), indicating that toll-2 is not essential in LPS signaling.

Several mammalian toll homologs have now been identified, and several more may exist
undiscovered. The phenotypic consequences of mutations in genes encoding other members of
the family remain to be seen. Remarkably, C3H/HeJ and C57BL/10ScCR mice are
developmentally and phenotypically intact, aside from their inability to respond effectively to
LPS and to gram-negative infection. Their response to products of gram-positive organisms and
most other microbes is intact. This fact would suggest that TLR-4 has been retained in evolution
principally for the purpose of serving the LPS response pathway. Malo and coworkers have
recently adduced evidence to suggest that, in birds, distinct allelic forms of Lps may influence
survival during gram-negative infection. As such, polymorphisms associated with the tenascin
locus predict outcome following infection of outbred chickens with Salmonella typhimurium. In
mice, and presumably in birds, the tenascin locus is closely linked to Lps. It is entirely possible
that mutations of the human TLR-4 gene also influence susceptibility to gram-negative infection,
or its clinical outcome. As two independent mutations of TLR-4 have thus far become fixed in
mice, it is likely that human populations have also retained TLR-4 mutations. A search for such
mutations is currently in progress.

EXAMPLE 8

Genetic Variation at the TLR4 Locus 

Materials and Methods 

Determination of the complete mouse (Tlr4) and human (TLR4) genomic sequences. 

The mouse BAC 152C16 (from the 129/J strain; Research Genetics) was earlier shown
by the inventors to contain the Tlr4 gene in its entirety, and a small fraction of Tlr4 was also found
to reside in the overlapping BAC 309I17 (Poltorak et al., 1998). Human TLR4 was identified in
BAC 110P15 (Genome Systems) by hybridization screening. All three BACs were fragmented
by ultrasound, shotgun cloned into the vector pBluescript-KS, and sequenced extensively using
ABI model 373 and 377 sequencers. A total of 959 reads were obtained from 309I17, 1503 reads
from 110P15, and 2731 reads from 152C16. The average read length was approximately 700 nt. To
concentrate data acquisition efforts on the Tlr4 and TLR4 genes themselves, PCR primers were
fashioned to match regions flanking each gene. A 16 kb fragment was amplified from the mouse
BAC 152C16, and a 12 kb fragment was amplified from the human BAC 110P15, each
containing all exons of the respective gene. These fragments were also shotgun cloned and
sequenced extensively, so that the depth of sequence reached an average of 12 reads over the area
of greatest interest. Assembly was performed using the programs phred and Phrap (obtained
from Brent Ewing and Phil Green, University of Washington Genome Center). Interpretation of
repetitive elements was achieved with the program RepeatMasker (obtained from Arian Smit,
University of Washington Genome Center). A contiguous high-quality sequence 18974 bp in
length containing TLR4 was obtained from the human BAC, and a contig 91748 bp in length
containing Tlr4 was obtained from the mouse BAC. Over these intervals, the error rate was
estimated at <1 per 10"^ bp. The sequences have been posted to Genbank in annotated form 
(accession number AF177767 for the murine sequence, SEQ ID NO:48, and accession number 
AF177765 for the human sequence, SEQ ID NO:47). All data related to mutations are presented 
with reference to these sequences. 
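
As a rough illustration of the sequencing effort described above, the sketch below totals the shotgun reads from the three BACs and converts them to an approximate raw base yield using the stated average read length. The dictionary labels are descriptive placeholders rather than clone names, and the calculation ignores read overlap and vector sequence.

```python
AVERAGE_READ_NT = 700  # approximate average read length quoted in the text

# Shotgun reads per BAC, as quoted in the text; keys are descriptive labels only.
reads_per_bac = {
    "mouse BAC, partial Tlr4": 959,
    "human BAC, TLR4": 1503,
    "mouse BAC, complete Tlr4": 2731,
}

total_reads = sum(reads_per_bac.values())
raw_bases = total_reads * AVERAGE_READ_NT

print(f"total reads: {total_reads}")
print(f"approximate raw yield: {raw_bases / 1e6:.2f} Mb")
```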

Sequencing DNA from individual human, mouse, chimpanzee, and baboon samples.

Human DNA samples were obtained from blood anticoagulated with EDTA, drawn
from healthy laboratory personnel and from non-selected visitors to the Aston Center outpatient
clinic in Dallas, TX. The samples, 102 in all, were from a population of mixed ethnicity, but
were predominantly obtained from Caucasian donors. Samples were prepared using the Wizard
Genomic DNA Purification kit (Promega). Mouse DNA, obtained from animals of 35 Mus
musculus strains, was ordered from the Jackson Laboratories. Chimpanzee and baboon DNA
were obtained from Dr. Kurt Benirschke (U. of CA, San Diego) and Dr. Gregory Delzoppo
(Scripps Research Institute), respectively.

The three principal exons of Tlr4 and TLR4 were amplified independently from all
human and mouse genomic DNA samples, leaving a margin of approximately 50 bp to each side
of the exons so as to identify intronic mutations that might alter splicing. An alternative second
exon, apparent in the cDNA sequence, has been reported (Rock et al., 1998; Genbank accession
number U88880); it specifies a truncated and presumably inactive product. It was not analyzed
in the population survey.

All exons of the chimpanzee were amplified and sequenced using the same primers used
to amplify and sequence the human exons. For the baboon, the first two exons were also
amplified using these same primers; however, the third exon of the baboon was amplified with a
substituted primer at the 5' end.

The PCR products were isolated by agarose gel electrophoresis. Exons 1 and 2 were
sequenced using the same primers that were used for amplification. Exon 3 was sequenced using
the flanking primers, as well as a collection of eight internal primers. In this manner, the entire
coding region and all splice junctions of the human and mouse sequences could be covered with a
total of 14 sequencing reads, given that all reads were of high quality. All primers used for
amplification and sequencing are presented in Table II.
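
The total of 14 reads follows from the primer counts given above. The short tally below assumes that each primer yields exactly one read per sample and that each exon is amplified with one forward and one reverse primer; the dictionary keys are illustrative labels.

```python
# Sequencing reads per amplicon, assuming one read per primer.
reads_per_amplicon = {
    "exon 1": 2,       # forward and reverse amplification primers
    "exon 2": 2,       # forward and reverse amplification primers
    "exon 3": 2 + 8,   # two flanking primers plus eight internal primers
}

total_reads = sum(reads_per_amplicon.values())
print(f"reads per sample: {total_reads}")
```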

Independent assembly of each sample was required as a condition for further analysis, and
if such assembly failed, additional reads were executed using a secondary collection of primers.
Thereafter, mutations were identified en masse, by pooling all of the reads from 25 to 30 samples
at a time and reassembling with the program polyphred, using the phredPhrapPoly script
(obtained from Natalie Kolker, University of Washington Genome Center). Consed_alpha
(obtained from David Gordon, University of Washington Genome Center) was used to visualize
reads and mutations.

The annotated chimpanzee exon sequences have been submitted to Genbank with the 
accession numbers AF179218, AF179219, and AF179220. The baboon sequences have been 
submitted with the accession numbers [pending]. For the purpose of genetic comparisons, rat 
and hamster Tlr4 sequences were also used; their Genbank accession numbers are AF057025 and 
AF153676, respectively.



Genetic computation 

A 500 MHz DEC-alpha system equipped with 256 Mbytes of memory was used for direct 
analysis of sequence data as described above. In addition to the programs already mentioned, the 
GCG software (version 9.0) was used for alignment analysis. The Windows-based program
Generunner 3.0 (Hastings Software) was used for the design of oligonucleotide primers. A
spline curve describing heterogeneity of the Tlr4 polypeptide sequence from different species
was produced using the program Prism 3.0 (Graphpad Software, Inc.). Sequences were prepared
for submission with the use of the program Sequin 2.90 (obtained from NCBI). 



Results 

Overall structure of TLR4 and Tlr4

The mouse Tlr4 gene is somewhat longer than its human counterpart, owing to the greater 
length of intronic sequence (15337 bp from beginning to end of transcribed sequence in the 
mouse, SEQ ID NO:48, as compared to 11467 bp in the human, SEQ ID NO:47). There are three
exons in Tlr4, and each corresponds to a homologous sequence in the human gene. A human
cDNA sequence (Genbank accession number U88880) that includes a fourth exon, positioned
between the "normal" first and second exons, has been reported (Rock et al., 1998). When
included in the processed transcript, however, this exon specifies early termination of the 
polypeptide chain. While it is possible that translation is initiated distal to the added stop codon, 
and that a shorter product results in the human than in the mouse, such a situation would be 
unusual given the length of the 5'UTR that would then exist, and the presence of multiple 
upstream initiation codons. Moreover, there is no murine sequence homologous to the 
alternative second exon of the human gene. The biological significance of this exon is therefore 
unclear, and in all likelihood, its inclusion in the mRNA leads to the formation of a 
nonfunctional protein product. 

Neither the human nor the mouse gene displays a TATA element or CAAT box in the
proximal promoter region. A number of conserved promoter and enhancer motifs are apparent
on alignment of the murine and human 5' flanking sequences. Both Tlr4 and TLR4 lie in the
midst of repetitive sequences of retroviral origin, and no other genes may be detected in close
communication with either of them, using homology searches or the gene prediction algorithm
GRAIL.

Genetic variation at the human TLR4 locus 

In total, 204 human TLR4 alleles were sequenced in entirety. As such, all but the rarest 
alleles have likely been encountered. According to the formula: 

(1 - p) = (1 - x)^N,

where p is the probability of detecting an allele, x is the actual frequency of the allele in the
population surveyed, and N is the number of alleles examined, there is a 95% chance that any
allele with a frequency exceeding 1.46% was detected through sequencing this population.
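
The detection threshold quoted above can be checked numerically. The following is a minimal sketch that solves the formula for x with N = 204 alleles and p = 0.95; the function names are illustrative only and do not come from the software used in the study.

```python
def min_detectable_frequency(n_alleles: int, p_detect: float = 0.95) -> float:
    """Solve (1 - p) = (1 - x)^N for x, the lowest allele frequency
    that would be seen at least once with probability p."""
    return 1.0 - (1.0 - p_detect) ** (1.0 / n_alleles)


def detection_probability(freq: float, n_alleles: int) -> float:
    """Probability that an allele of frequency `freq` appears at least
    once among `n_alleles` independently sampled alleles."""
    return 1.0 - (1.0 - freq) ** n_alleles


# 102 individuals were surveyed, i.e. 204 alleles.
threshold = min_detectable_frequency(204)
print(f"{threshold:.2%}")  # about 1.46%
```

The calculation treats the 204 alleles as independent draws from the population, which is the assumption implicit in the formula.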

In all, 12 mutable sites were found in human TLR4 (Table III, FIG. 10). Of these, most
were confined to exon 3, though some were also located in the second intron. Five changes
observed in the coding region were silent (i.e., did not produce an amino acid substitution). Five
changes did cause amino acid substitutions, one of which affected the cytoplasmic domain, and
four of which affected the ectodomain.

The most common allele in the human population exists at a frequency of 86.3%,
and 74.5% of the population is homozygous for this allele. The most common TLR4
polymorphism (designated TLR4-B; Genbank accession number AF177766) exists at an
allelic frequency of 6%, and consists of a double amino acid substitution (residues 299
and 399 of the human polypeptide chain; nts 12874 and 13174 of the gene), one of which
(residue 299) affects an aspartic acid residue conserved in mice, rats, hamsters, and
chimpanzees (though differing in the baboon sequence). Though the mutations are
almost always co-inherited, and hence, must lie in cis with one another, a single instance
of mutation at the 12874 site was observed in the absence of the 13174 mutation. This
would suggest that the double mutation may have arisen from an ancestral allele that is
now very rare, and conceivably, it may confer an advantage to carriers.
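
As a consistency check on the two frequencies quoted above, the following minimal sketch computes the homozygote fraction expected for the most common allele. It assumes Hardy-Weinberg proportions (random mating), an assumption that is not stated in the text; the function name is illustrative.

```python
def expected_homozygote_fraction(allele_freq: float) -> float:
    """Expected fraction of individuals homozygous for an allele of the
    given frequency, assuming Hardy-Weinberg proportions (p squared)."""
    return allele_freq ** 2


p_common = 0.863  # frequency of the most common human TLR4 allele
homozygous = expected_homozygote_fraction(p_common)
print(round(homozygous, 3))  # 0.745
```

Under that assumption, an allele frequency of 86.3% corresponds to about 74.5% homozygotes, which agrees with the figure given for the surveyed population.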

Of the other mutations observed, 3 modified relatively conserved amino acid
residues, whereas 4 modified relatively variable residues.

Genetic variation at the mouse Tlr4 locus. 

Among 35 strains of Mus musculus, 10 different alleles were identified, based on
mutations occurring at 22 sites with respect to the reference sequence, 13 of which create
amino acid substitutions (Table IV; FIG. 11). Hence, greater variation was observed
among mice than among humans, with the most common murine allele represented at a
frequency of only 69%. To a far greater extent in mice than in humans, the ancestry of
different Tlr4 alleles may be traced, as many deviations from the reference allele occur in
conjunction with one another. A plausible arrangement of strain relationships is presented
in FIG. 12. Some strains have accumulated many more mutations than others. For
example, the P/J strain Tlr4 gene exhibits eleven mutations that distinguish it from the
most common haplotype, six of them specifying changes in the Tlr4 amino acid sequence;
the SEA/GnJ strain differs by nine mutations, and the strains NZW/J and VM/Dk, which
are identical to one another, differ at six sites. Shared mutations suggest that
interbreeding of some strains took place after their initial mutational separation had
occurred, leading to the introduction of groups of mutations by genetic recombination.
Hence, mice of the P/J, NZW/J, and VM/Dk strains have several mutations that are
observed in the A/J and BALB/c strains, but also lack some of the mutations of the latter
strains, and have unique mutations of their own.

LEGEND

1. NZO/HlLt 2. SI/Col 3. DBA/1J 4. A/J 5. EL/Suz 6. CBA/J 7. AKR/J 8. BALB/cJ 9.
Ddy/jCL 10. P/J 11. MRL/MpJ 12. SJL/J 13. NOD/LtJ 14. 129/J 15. FL/1Re 16. MA/MyJ
17. SWR/J 18. LP/J 19. PRO/1ReJ 20. SOD1/Ei 21. SEA/GnJ 22. SM/J 23. KK/HlJ 24.
ST/bJ 25. WB/Re 26. YBR/Ei 27. FVB/NJ 28. PL/J 29. LT/ChReSv 30. RIIIS/J 31. RF/J
32. NZB/BlNJ 33. AU/SsJ 34. NZW/LacJ 35. VM/Dk



TABLE IV.

POLYMORPHISM OF Tlr4 AMONG MICE

Mouse                     Nucleotide (Genomic)   Exon/Intron   AA          Receptor Domain   Conserved

10                        26400 A->G             Exon 2
4,8,21,22                 37685: (T)10           Intron 2
34,35                     37685: (T)12           Intron 2
23                        37754 G->A             Exon 3        94 D->N     Ecto              Yes
4,8,10,20,21,34,35        38101 G->A             Exon 3        209 M->I    Ecto              No
21                        38130 A->G             Exon 3        219 D->G    Ecto              No
4,8,21                    38234 G->A             Exon 3        254 V->I    Ecto              Yes
10                        38584 A->G             Exon 3
21                        38742 A->T             Exon 3        423 Q->L    Ecto              No
10                        38794 G->A             Exon 3
10                        38903 G->T             Exon 3        477 A->S    Ecto              Yes
18                        39020 A->G             Exon 3        516 T->A    Ecto              No
4,8,10,20,21,22,34,35     39199 C->T             Exon 3
4,8,10,20,21,34,35        39253 A->C             Exon 3        593 E->D    Ecto              No
23                        39273 A->T             Exon 3        600 N->I    Ecto              No
10                        39383 G->A             Exon 3        637 V->I    TM                Yes
19                        39604 T->C             Exon 3
4,8,20,21,34,35           39631 C->T             Exon 3
4,8,20,21,34,35           39756 G->A             Exon 3        761 R->H    Cyto              Yes
18                        39826 T->C             Exon 3
10                        39907 T->G             Exon 3        811 N->K    Cyto              No

As in the human TLR4 gene, most of the murine mutations reside within exon 3,
and only two substitutions are noted to modify the cytoplasmic domain (FIG. 12). Of
these, however, one mutation (R761H) is fairly common among the strains surveyed, and
corresponds exactly to the human mutation (R763H), observed in one individual out of
the 102 surveyed. The same residue has been reported as an H in the hamster. A single
conservative substitution (V637I) was noted within the transmembrane domain of the P/J
strain.


Anthropoid ape and lower primate TLR4 sequences, and their relationship to the human 
and rodent sequences 

The human and chimpanzee amino acid sequences are nearly identical over the
interval studied, distinguished only by three amino acid substitutions. The baboon
sequence is 93.5% identical to the human in the ectodomain, differs in the transmembrane
domain by one substitution out of 30 residues, and differs in the proximal cytoplasmic
domain by only 1 residue in 155. At the C-terminus, however, homology is badly
disrupted, so that 16 of the last 21 human residues are not replicated in the baboon
protein, which is 13 amino acids shorter than the human protein. Similarly, among
rodents, the C-terminus of the protein is the least conserved. Overall, the order of
conservation with respect to domain is:

proximal cytoplasmic domain > transmembrane domain > ectodomain > distal
cytoplasmic domain.


Indeed, interspecific comparisons of complete Tlr4 amino acid sequence confirm 
the existence of a hypervariable region at the distal end of the Tlr4 cytoplasmic domain 
(Table V, FIG. 13). 

TABLE V.

Homology among functional domains of Tlr4 from six species. Percentages refer to
identity on FASTA comparison. Ecto, residues 1-631; TM, residues 632-662; Proximal,
residues 663-819; Distal, residues 820-839.

ECTO        Human    Chimp    Baboon   Rat      Mouse    Hamster
Human       100%
Chimp       99.6%    100%
Baboon      91.5%    91.5%    100%
Rat         61.3%    60.7%    59.4%    100%
Mouse       61.9%    62.0%    60.1%    82.9%    100%
Hamster     64.3%    64.2%    62.5%    73.8%    74.8%    100%

TM          Human    Chimp    Baboon   Rat      Mouse    Hamster
Human       100%
Chimp       100%     100%
Baboon      97.1%    97.1%    100%
Rat         61.1%    61.1%    67.7%    100%
Mouse       70.6%    70.6%    70.6%    91.2%    100%
Hamster     73.5%    73.5%    73.5%    79.4%    79.4%    100%

PROXIMAL    Human    Chimp    Baboon   Rat      Mouse    Hamster
Human       100%
Chimp       99.4%    100%
Baboon      99.4%    98.7%    100%
Rat         91.7%    91.0%    91.0%    100%
Mouse       90.4%    89.7%    89.7%    98.1%    100%
Hamster     91.7%    91.0%    91.0%    97.4%    95.5%    100%

DISTAL      Human    Chimp    Baboon   Rat      Mouse    Hamster
Human       100%
Chimp       100%     100%
Baboon      50%      50%      100%
Rat         38.1%    38.1%    NS       100%
Mouse       26.3%    26.3%    NS       63.2%    100%
Hamster     40.9%    40.9%    NS       40.9%    45.5%    100%

EXAMPLE 9

Tlr4-MEDIATED LPS SIGNAL TRANSDUCTION IN MACROPHAGES 



The mammalian LPS sensor is formed by a complex array of proteins, some of
which may as yet be unknown. Complexes of LPS and LBP are initially engaged at the
cell surface by CD14, a glycosylphosphoinositol-linked protein with no cytoplasmic
domain (Wright et al., 1990). The biological relevance of CD14 in LPS signaling,
originally deduced from binding and transfection studies (Wright, 1990; Kirkland et al.,
1990), is supported by gene knockout data, which revealed that CD14 expression is
required for normal sensitivity to LPS (Haziot et al., 1996). However, it was assumed that
a second, membrane-spanning protein must ultimately transduce the LPS signal, since
CD14 lacks a cytoplasmic domain.

The protein that likely fulfills this role was identified through studies of mice,
wherein mutations of a single gene (Lps) have long been known to abolish LPS signal
transduction (Sultzer, 1968; Watson et al., 1978). The Lps^d allele, represented in
C3H/HeJ mice, is codominant in the sense that Lps^n/Lps^d heterozygotes exhibit
intermediate sensitivity to LPS (Rosenstreich et al., 1978), and their macrophages
display intermediate levels of TNF production in response to LPS. The innominate
non-responder allele represented in C57BL/10ScCr mice is recessive to the wild-type allele, in
that heterozygotes display normal responses to LPS (Coutinho and Meo, 1978). Using
positional methods, the inventors have determined that Lps encodes the toll-like receptor
4 (Tlr4), a single-spanning transmembrane protein with a leucine-rich ectodomain and a
"Toll-like" cytoplasmic domain (Poltorak et al., 1998a; Poltorak et al., 1998b). The Lps^d
allele bears a missense mutation (2342 C->A; P712H) which lies within the cytoplasmic
domain of the polypeptide chain. The mutation in C57BL/10ScCr mice is a null allele.
Relying on these data (Qureshi et al., 1999a), other workers confirmed the presence of the
mutations (Qureshi et al., 1999b). Moreover, Hoshino and colleagues demonstrated that
a Tlr4 knockout produces an excellent phenocopy of the naturally occurring Tlr4
mutations (Hoshino et al., 1999).

In order to directly examine the role of Tlr4 as a transducer of the LPS signal in
macrophages, the inventors expressed the normal mouse protein, and various mutant
forms, in RAW 264.7 macrophages (FIG. 14; Table VIII). These
cells of murine origin are highly responsive to LPS, and are known to express the Tlr4
mRNA (Poltorak et al., 1998b). Unlike cell lines of non-myeloid origin (e.g., 293 cells or
CHO cells), they express all proteins required for the elicitation of a biologically relevant
response to LPS (e.g., TNF production). As such, they may be used to determine whether
Tlr4 is a limiting factor in the initiation of an LPS signal, and to examine the mechanism
by which the Tlr4^Lps-d allele exerts its dominant inhibitory effect on signaling. This was
determined by performing more than 3,000 assays of TNF production, induced over an
extensive range of LPS concentrations in numerous stable clones, since individual clones
show considerable background variability in LPS responses. Moreover, because culture
conditions can affect the sensitivity of the TNF assay and production of TNF by
LPS-stimulated cells, all transfected and control clones were induced and assayed in parallel.
Statistical analyses of shifts in the EC50 values determined for individual clones were
then applied in order to determine the influence of recombinant protein expression.

RAW 264.7 cells were first cotransfected with cDNAs derived from the Tlr4^Lps-n
and Tlr4^Lps-d alleles and with a vector encoding neo. The amino terminus of each protein
was flag-tagged to permit measurement of expression at the cell surface using the
monoclonal antibody M2 (obtained from Sigma). After G418 selection, stable clones
expressing each Tlr4 isoform or transfected with vector alone were examined for LPS
signal transduction. The binding isotherm from one such clone, transfected with the
Tlr4^Lps-n construct, is displayed in FIG. 15A, and the flag copy number for clones bearing
each construct is displayed in FIG. 15B. Despite the use of a strong promoter, the mean
copy number rarely exceeded 3 x 10^5 per cell, and the range of expression among all
clones spanned less than an order of magnitude. The relatively low copy number
achieved is consistent with the possibility that surface expression may be limited by the
level of co-expression of other proteins (e.g., MD-2 (Shimazu et al., 1999)), and the lower
copy number in clones bearing truncated constructs as compared with full-length
constructs may reflect diminished stability.

Composite EC50 analysis revealed strong augmentation of the LPS response (a
30-fold leftward shift of the curve) resulting from modest over-expression of the normal
protein; even stronger suppression (a 2600-fold shift to the right) was observed with
expression of the mutant isoform. Hence, with respect to the over-expression of Tlr4^Lps-n
and Tlr4^Lps-d proteins, a 74,000-fold difference in the mean response is apparent at the
EC50 point (FIG. 15C). A more conservative approach, based on measurement of the
LPS EC50 for each individual clone, also revealed that the Tlr4^Lps-n isoform strongly
enhances LPS sensitivity, while the Tlr4^Lps-d isoform strongly suppresses it (FIG. 15D).
According to this method of estimation, the composite range of the responsiveness (mean
EC50 of Tlr4^Lps-d transfected clones/mean EC50 of Tlr4^Lps-n transfected clones) was
1120-fold. Taking the dimension of receptor number into account (FIG. 15E), and excluding
the statistical contribution of the clones transfected with vector alone, no significant
correlation was observed between LPS signal intensity and the absolute number of
recombinant Tlr4 molecules on the cell surface. This suggests that the number of
artificial receptors expressed (usually ~1-2 x 10^5) is well in excess of the number of
native receptors, and that maximum augmentation of LPS response is achieved in each
clone transfected with Tlr4^Lps-n, with variation attributable to other factors. It is
interesting in this regard that, on a linear scale, the inhibitory effect of Tlr4^Lps-d
over-expression vastly exceeds the augmenting effect of Tlr4^Lps-n expression. In fact, most of
the macrophage clones were rendered virtually unresponsive to LPS through
over-expression of Tlr4^Lps-d (LPS EC50 > 10 µg/ml).
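
Because EC50 shifts in opposite directions combine multiplicatively, the composite range between the normal and mutant isoforms can be sketched from the two rounded fold-shifts quoted above. The following is an illustrative calculation only; the variable names are hypothetical, and the rounded inputs give a product of the same order as, but not identical to, the 74,000-fold figure reported, which presumably derives from unrounded EC50 values.

```python
import math

# Rounded fold-shifts quoted in the text (composite EC50 analysis).
leftward_shift_normal = 30.0     # augmentation by the normal isoform
rightward_shift_mutant = 2600.0  # suppression by the mutant isoform

# Shifts in opposite directions multiply to give the total separation.
composite_fold = leftward_shift_normal * rightward_shift_mutant
composite_orders = math.log10(composite_fold)

# Per-clone estimate quoted in the text, for comparison.
per_clone_fold = 1120.0
per_clone_orders = math.log10(per_clone_fold)

print(f"composite: {composite_fold:.0f}-fold ({composite_orders:.1f} orders of magnitude)")
print(f"per-clone: {per_clone_fold:.0f}-fold ({per_clone_orders:.1f} orders of magnitude)")
```

On a logarithmic scale the two methods of estimation bracket a range of roughly three to five orders of magnitude.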


Several conclusions can be drawn from these results. First, the dominant effect
of the Tlr4^Lps-d allele may be directly demonstrated through transfection-based expression
of the protein at moderate levels in an LPS-responsive macrophage cell line. Second,
since over-expression of Tlr4^Lps-n augments the LPS response, the intensity of the LPS
signal is normally limited by the quantity of Tlr4 protein on the macrophage membrane.
Third, since there appears to be little correlation between the absolute number of
recombinant receptors expressed and the magnitude of augmentation or inhibition
achieved, it may be inferred that saturation of another component of the signaling
cascade, either proximal or distal to Tlr4, occurs when the level of Tlr4 expression
exceeds a certain threshold, perhaps in the range of several thousand copies per cell. By
the same token, the level of endogenous Tlr4 expression is probably very low: perhaps
lower than 10 (Michalek et al., 1980) copies per cell (a finding consistent with the fact
that Tlr4 mRNA is of very low abundance (Poltorak et al., 1998b), but nonetheless
remarkable in view of the global effects wrought by activation of the receptor). Fourth,
and perhaps most important of all, since rather modest changes in the level of normal Tlr4
protein expression or the expression of a dominant negative Tlr4 isoform can shift LPS
sensitivity over a range that spans three to four orders of magnitude, one may infer that
Tlr4 is of preponderant importance in LPS signal transduction: there is little room for the
belief that independent pathways act to transduce the LPS signal as well. This last point
is fully supported by the observation that mutation or deletion of the Tlr4 locus can
completely abrogate LPS signaling (Sultzer, 1968; Coutinho and Meo, 1978; Poltorak et
al., 1998b).

In one model, the dominant suppressive effect of Tlr4^Lps-d might be ascribed to the
postulated multimeric structure of the Tlr4 protein (Schneider et al., 1991; Medzhitov et
al., 1997), given that unproductive association between normal and abnormal subunits
yields inhibition of signaling. If association between subunits is principally dependent
upon ectodomain contacts, one would predict that any mutation that disrupts function of
the Tlr4 cytoplasmic domain might impede signal transduction in a dominant fashion, just
as observed with Tlr4^Lps-d. To examine this hypothesis, the inventors expressed a
truncated version of the Tlr4 protein, lacking the entire cytoplasmic domain. This protein
was well expressed on the cell surface, but had only a weak inhibitory effect on LPS
signaling, which fell short of significance according to the more stringent method of
analysis (FIG. 16A and FIG. 16B). Hence, the Tlr4^Lps-d isoform exerts a strong dominant
effect whereas deletion of the entire cytoplasmic domain does so weakly at most. The
inventors conclude that, if the model of dominant inhibition based on multimeric
structure is correct, cytoplasmic domain contacts, in addition to ectodomain contacts,
must serve to maintain the holoprotein in a multimeric state. Given free exchange
between subunits, mutant Tlr4 proteins lacking a cytoplasmic domain would predictably
be excluded from the signaling complex in favor of intact subunits that interact more
strongly with one another.

The biological consequences of interactions between normal and mutant
cytoplasmic domains and the intact Tlr4 protein were tested directly. Upon expressing
membrane-associated versions of the Tlr4 cytoplasmic domain (with either the normal or
mutant sequence and an exteriorized flag peptide, but lacking the entire ectodomain), the
inventors observed that the all-cytoplasmic Tlr4^Lps-d isoform inhibited LPS signaling
(rightward EC50 shift of 39-fold; p=0.0019). The Tlr4^Lps-n isoform was a significantly
weaker inhibitor than the Tlr4^Lps-d isoform (p=0.0214), though it also blocked signaling
when overexpressed (rightward EC50 shift of 4-fold; p=0.035). These results suggest
that the Tlr4^Lps-d mutation not only permits interaction between normal and mutant
subunits, but actually enhances the interaction, or actively engages downstream signaling
molecules, thus blocking the LPS response (FIG. 17A and FIG. 17B).

The primacy of Tlr4 in LPS signal transduction was originally suggested by
positional cloning data (Poltorak et al., 1998a; Poltorak et al., 1998b). Direct
confirmation of Tlr4's function as the transmembrane signaling component of the LPS
receptor complex has lagged, due largely to the difficulties inherent in transiently
transfecting macrophage lines with high efficiency. Rather, there has been a tendency to
utilize non-macrophage lines (e.g., HEK 293 cells) in transfection-based studies of LPS
signaling. However, these cells do not replicate the biological response phenotype of
authentic macrophages. In particular, they do not produce TNF in response to LPS; hence,
a surrogate endpoint of response (nuclear translocation of NF-κB, measured using a
transcriptional reporter) has been adopted in most such studies. However, it is quite clear
that NF-κB translocation cannot be equated with an LPS response, since many stimuli that
elicit NF-κB translocation yield effects that have little or nothing in common with the LPS
response in other respects. Moreover, it has recently been shown that mutational
inactivation of MyD88, which is known to engage Tlr4, leads to a state of profound LPS
unresponsiveness, though permitting NF-κB translocation. Finally, given that a response
of any kind is observed in non-macrophage cell lines, there exists no standard for
comparison. It has never been clear, for example, that the magnitude of the NF-κB
response approaches that witnessed in a normal macrophage over an identical range of
LPS concentrations, nor is it known what effect this might have in a macrophage. These
technical issues have, to date, confounded interpretation of which molecule actually does
transduce the LPS signal, a role previously ascribed to Tlr2 (Yang et al., 1998;
Kirschning et al., 1998) but now clearly attributable to Tlr4.

The present data reveal that Tlr4 is the limiting factor in LPS signal transduction
in LPS-responsive macrophages. Over-expression of Tlr4 in cells that already express it
augments the LPS response, by about 30-fold on average. The relationship between the
level of Tlr4 expression and biological response indicates that, although other proteins
fulfill indispensable functions in LPS signal transduction both upstream (Wright et al.,
1990) and downstream (Kawai et al., 1999; Muzio et al., 1998; Medzhitov et al., 1998)
from Tlr4, the quantity of Tlr4 expressed is an important limiting factor in the intensity of
the signal that is evoked. Hence, sensitivity to LPS is likely controlled through
modulation of Tlr4 biosynthesis or activity. Priming by interferon (Pace et al., 1985; Lau
and Livesey, 1989; Hayes and Zoon, 1993) or by treatment with facultative intracellular
pathogens (Vogel et al., 1980; Haranaka et al., 1984; Matsuura and Galanos, 1990) can
greatly enhance sensitivity to LPS, while corticosteroids create a state of LPS resistance
(Beutler et al., 1986). Such modulation may be achieved through alteration of Tlr4
structure or expression, or alternatively, through changes in sensitivity to the signal that
Tlr4 initiates, or changes in the intensity of the signal that Tlr4 receives.

The over-expression of a membrane-anchored Tlr4 ectodomain (lacking any of the
wild-type cytoplasmic domain) inhibited the LPS response only weakly, if at all. This
failure of the overexpressed ectodomain to block signaling by a competitive mechanism
implies that upstream components of the signal transduction pathway must either exist in
excess with respect to Tlr4, or must interact with Tlr4 at very low affinity. There is now
good reason to doubt the proposal (Modlin et al., 1999) that the expression of soluble
Tlr4 might prove an effective means of interdicting the LPS signal in vivo, particularly in
view of the fact that a membrane-anchored form of the protein would be sterically
positioned to exert such an effect with maximum efficiency, while a soluble form would
not be. It is, for example, possible to calculate the local concentration of Tlr4 ectodomain
achieved through over-expression of a membrane-associated version of the protein.
Assuming that the Tlr4 ectodomain resides within a space that is 100 Å "deep" from its
most apical point to the surface of the membrane, and further assuming that the
macrophage is a spherical body with a 15 µm radius, the expression of 2 x 10^5 receptors
per cell corresponds to a protein concentration of 1.2 x 10^-5 M, or 840 µg of ectodomain
per ml. While it might be possible to achieve such concentrations of soluble ectodomain
in vivo, it would not be easy to do so, and at that, little or no attenuation of the LPS signal
would be anticipated. On the other hand, interventions that inhibit contact between Tlr4
subunits would be likely to have a pronounced impact on signal transduction.
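
The local concentration estimate above can be reproduced with a short calculation. The following is a minimal sketch under stated assumptions: the ectodomain is confined to a thin shell (surface area times depth) around a sphere of 15 µm radius and 100 Å depth, the cell carries 2 x 10^5 copies, and the ectodomain has a molecular weight of about 70,000 (an assumed value, chosen because it is the mass implied by equating about 1.2 x 10^-5 M with about 840 µg/ml). Function and variable names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def shell_volume_litres(radius_um: float, depth_angstrom: float) -> float:
    """Volume of a thin shell (sphere surface area x depth), in litres."""
    radius_cm = radius_um * 1e-4      # 1 um = 1e-4 cm
    depth_cm = depth_angstrom * 1e-8  # 1 angstrom = 1e-8 cm
    volume_cm3 = 4.0 * math.pi * radius_cm ** 2 * depth_cm
    return volume_cm3 * 1e-3          # 1 cm^3 = 1e-3 L


def local_molarity(copies_per_cell: float, radius_um: float,
                   depth_angstrom: float) -> float:
    """Molar concentration of receptors confined to the surface shell."""
    moles = copies_per_cell / AVOGADRO
    return moles / shell_volume_litres(radius_um, depth_angstrom)


molar = local_molarity(2e5, radius_um=15.0, depth_angstrom=100.0)

ASSUMED_MW = 70_000.0                 # g/mol, assumed ectodomain mass
ug_per_ml = molar * ASSUMED_MW * 1e3  # g/L -> ug/ml

print(f"{molar:.2e} M, about {ug_per_ml:.0f} ug/ml")
```

With these inputs the sketch gives a concentration close to 1.2 x 10^-5 M and a mass concentration in the range of 800 to 850 µg/ml; the small difference from 840 µg/ml reflects rounding of the molarity and the assumed molecular weight.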



TABLE VIII

Construct               Residues expressed   Copies/cell   ±SD         n

Tlr4^Lps-n              22-835               2.1 x 10^     4.1 x 10^   10
Tlr4^Lps-d              22-835 (P712H)       2.0 x 10^     1.4 x 10^   10
Tlr4^Lps-n Cyt. Dom.    630-835              1.2 x 10^     3.8 x 10^   7
Tlr4^Lps-d Cyt. Dom.    630-835 (P712H)      9.9 x 10^     4.0 x 10^   8
Ectodomain              22-660               1.4 x 10^     5.1 x 10^   9


All of the compositions and methods disclosed and claimed herein can be made
and executed without undue experimentation in light of the present disclosure. While the
compositions and methods of this invention have been described in terms of preferred
embodiments, it will be apparent to those of skill in the art that variations may be applied
to the compositions and methods and in the steps or in the sequence of steps of the
method described herein without departing from the concept, spirit and scope of the
invention. More specifically, it will be apparent that certain agents which are both
chemically and physiologically related may be substituted for the agents described herein
while the same or similar results would be achieved. All such similar substitutes and
modifications apparent to those skilled in the art are deemed to be within the spirit, scope
and concept of the invention as defined by the appended claims.

WHAT IS CLAIMED IS:



1. A method for screening for susceptibility to infection comprising:
a) obtaining sample nucleic acid from an animal; and
b) analyzing the sample nucleic acid to detect a mutation in a gene encoding
a TLR-4 polypeptide relative to a sequence of a gene encoding a native
TLR-4 polypeptide;
wherein a mutation in the gene encoding the TLR-4 polypeptide is indicative of
susceptibility to infection.

2. The method of claim 1, wherein the sample nucleic acid is DNA. 

3. The method of claim 1, wherein the step of analyzing the sample nucleic acid 
comprises sequencing the nucleic acid to obtain a sequence. 


4. The method of claim 3, wherein the obtained sequence is compared to a nucleic 
acid sequence from a gene encoding a native TLR-4 polypeptide. 

5. The method of claim 1, wherein the sequence of the gene encoding a native TLR-4
polypeptide has a nucleic acid sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ
ID NO:5, SEQ ID NO:46, SEQ ID NO:47 or SEQ ID NO:48.

6. The method of claim 1, wherein the native TLR-4 polypeptide has an amino acid
sequence set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or
SEQ ID NO:99.

7. The method of claim 1, wherein the sample nucleic acid comprises a mutation that 
results in a change in amino acid sequence of the encoded TLR-4 polypeptide relative to 
the amino acid sequence of a native TLR-4 polypeptide. 


8. The method of claim 7, wherein the sample nucleic acid comprises a mutation that 
results in a change from PRO to HIS at residue 712 of the native TLR-4 polypeptide 
amino acid sequence of SEQ ID NO:99. 

9. The method of claim 7, wherein the sample nucleic acid comprises a mutation that
results in a change from GLU to LYS at residue 178, a change from ARG to HIS at
residue 763, a change from GLN to ARG at residue 188, a change from ASP to GLY at
residue 299, a change from ASN to SER at residue 329, a change from GLU to LYS at
residue 474, a change from ARG to HIS at residue 763, a change from TYR to CYS at
residue 46, a change from PRO to HIS at residue 145, a change from CYS to TYR at
residue 281, a change from ASN to HIS at residue 624, or a change from THR to ILE at
residue 399 of the native TLR-4 polypeptide amino acid sequence of SEQ ID NO:98.

10. The method of claim 9, wherein the sample nucleic acid comprises at least a
second mutation.

11. The method of claim 10, wherein said second mutation results in a deletion of
VAL-GLY-THR at residues 827-829 of the native TLR-4 polypeptide amino acid
sequence of SEQ ID NO:98.

12. The method of claim 3, wherein the sample nucleic acid comprises at least one 
point mutation relative to a nucleic acid sequence from a gene encoding a native TLR-4 
polypeptide. 

13. The method of claim 12, wherein the sample nucleic acid comprises at least two
point mutations relative to a nucleic acid sequence from a gene encoding a native TLR-4 
polypeptide. 

14. The method of claim 12, wherein said mutation is in nucleotide 2342 of the
nucleic acid sequence of SEQ ID NO:46.

15. The method of claim 14, wherein said at least one mutation is a change from
nucleotide C to nucleotide A at position 2342 of the nucleic acid sequence.
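A single C-to-A nucleotide change of the kind recited in claim 15 can alter one encoded residue. In the standard genetic code, replacing the middle base of a proline codon (CCN) with A yields histidine (CAT, CAC) or glutamine (CAA, CAG). The sketch below shows this with illustrative codons. The claims do not state which codon contains position 2342, so the codons used here are assumptions, and the helper name is hypothetical.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TO_RESIDUE = {
    "CCT": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln",
}

def substitute_base(codon, index, new_base):
    """Return the codon with the base at the 0-based index replaced."""
    return codon[:index] + new_base + codon[index + 1:]

# Replace the middle C of each proline codon with A and report the new residue.
for codon in ("CCT", "CCC", "CCA", "CCG"):
    mutant = substitute_base(codon, 1, "A")
    print(codon, CODON_TO_RESIDUE[codon], "->", mutant, CODON_TO_RESIDUE[mutant])
```

Only two of the four proline codons give histidine after this change, which is why the codon in the actual sequence matters.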

16. The method of claim 12, wherein said at least one mutation is in Exon 2, Exon 3
or Intron 2 of the sequence of SEQ ID NO:47.

17. The method of claim 16, wherein said at least one mutation is a change from
nucleotide A to nucleotide G at position 8457, a change from nucleotide G to nucleotide
A at position 8612, a change from nucleotide A to nucleotide G at position 8631, a
change from nucleotide A to nucleotide G at position 12245, a change from nucleotide T
to nucleotide C at position 12293, a change from nucleotide C to nucleotide A at position
12412, a change from nucleotide C to nucleotide A at position 12413, a change from
nucleotide A to nucleotide G at position 12541, a change from nucleotide G to nucleotide
A at position 12820, a change from nucleotide A to nucleotide G at position 12874, a
change from nucleotide A to nucleotide G at position 12964, a change from nucleotide C
to nucleotide T at position 13174, a change from nucleotide G to nucleotide A at position
13398, a change from nucleotide G to nucleotide A at position 13769, a change from
nucleotide A to nucleotide C at position 13848, a change from nucleotide G to nucleotide
A at position 13937, or a change from nucleotide G to nucleotide A at position 14266 of
the sequence of SEQ ID NO:47.

18. The method of claim 16, wherein said at least one mutation is a deletion of 
nucleotide T at position 12228 of the sequence of SEQ ID NO:47. 

19. The method of claim 16, wherein said at least one mutation is a change from 
nucleotide A to nucleotide G at position 12245 of the gene sequence and a deletion of 
nucleotides 14453 to 14461 of the sequence of SEQ ID NO:47.

20. The method of claim 16, wherein the sample nucleic acid sequence comprises at
least two mutations relative to the sequence of SEQ ID NO:47. 

21. The method of claim 20, wherein said at least two mutations comprise a change
from nucleotide C to nucleotide T at position 12399 and a change from nucleotide G to
nucleotide A at position 12510, a change from nucleotide C to nucleotide A at position
12413 and a change from nucleotide G to nucleotide A at position 14266, or a change
from nucleotide A to nucleotide G at position 12874 and a change from nucleotide C to
nucleotide T at position 13174 of the sequence of SEQ ID NO:47.

22. The method of claim 1, wherein the step of analyzing the sample nucleic acid
comprises PCR, an RNase protection assay, or an RFLP procedure.
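An RFLP procedure of the kind named in claim 22 detects a mutation when it creates or destroys a restriction site, which changes the fragment lengths produced by digestion. The following is a minimal sketch, assuming a linear sequence and a single enzyme. EcoRI (site GAATTC, cutting after the first G) is used only as a familiar example; the sequences and the function name are hypothetical and are not drawn from the sequence listing.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of the fragments left after cutting a linear sequence at every site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Hypothetical 16-base sequences; the mutant has one base changed inside the site.
native_dna = "AAAAGAATTCAAAAAA"
mutant_dna = "AAAAGACTTCAAAAAA"
print(fragment_lengths(native_dna, "GAATTC", 1))
print(fragment_lengths(mutant_dna, "GAATTC", 1))
```

The native sequence is cut into two fragments while the mutant stays intact, and that difference in fragment pattern is what the procedure reads out.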

23. A method of reducing susceptibility of an animal to infection comprising the step
of modulating an LPS mediated response in the animal.

24. The method of claim 22, further comprising the step of diagnosing an animal with 
an infection via analysis of a TLR-4-encoding nucleic acid sequence for a mutation 
relative to a sequence of a gene encoding a native TLR-4 polypeptide. 

25. The method of claim 24, wherein the native TLR-4 polypeptide is a TLR-4 
polypeptide that has the sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ 
ID NO:98 or SEQ ID NO:99. 

26. The method of claim 22, wherein the step of modulating LPS receptor function
comprises providing a TLR-4 polypeptide to the animal. 

27. The method of claim 26, wherein the TLR-4 polypeptide is a native TLR-4 
polypeptide. 

28. The method of claim 26, wherein the provision of a TLR-4 polypeptide is
accomplished by inducing expression of a TLR-4 polypeptide.

29. The method of claim 28, wherein the expression of a TLR-4 polypeptide encoded
in the animal's genome is induced. 

30. The method of claim 28, wherein the expression of a TLR-4 polypeptide encoded 
by a nucleic acid provided to the animal is induced. 

31. The method of claim 26, wherein the provision of a TLR-4 polypeptide is
accomplished by a method comprising introduction of a TLR-4-encoding nucleic acid to 
the animal. 

32. The method of claim 26, wherein the provision of a TLR-4 polypeptide is
accomplished by injecting a TLR-4 polypeptide into the animal. 

33. The method of claim 23, wherein the step of modulating LPS receptor function in 
the animal comprises providing a modulator of TLR-4 to the animal. 

34. The method of claim 33, wherein the modulator of TLR-4 is an agonist of TLR-4. 

35. The method of claim 33, wherein the modulator of TLR-4 is an antagonist of 
TLR-4. 

36. The method of claim 33, wherein the modulator of TLR-4 modulates transcription 
of a TLR-4-encoding nucleic acid. 

37. The method of claim 33, wherein the modulator of TLR-4 modulates translation
of a TLR-4-encoding nucleic acid.

38. A method of screening for modulators of an LPS mediated response comprising
the steps of: 

a) obtaining a TLR-4 polypeptide;
b) determining a standard activity profile of the TLR-4 polypeptide;
c) contacting the TLR-4 polypeptide with a putative modulator; and
d) assaying for a change in the standard activity profile.
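Steps b) through d) of claim 38 amount to comparing a baseline measurement with a measurement taken after contact with the putative modulator. The following is a minimal sketch, assuming the activity profile is reduced to replicate numeric readings (for example reporter signal). The function name, the 20% threshold and the readings are hypothetical choices, not values from the specification.

```python
from statistics import mean

def classify_modulator(baseline, treated, threshold=0.2):
    """Compare mean treated activity with mean baseline activity.

    Returns a label based on the fractional change relative to baseline.
    """
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline activity must be non-zero")
    change = (mean(treated) - base) / base
    if change > threshold:
        return "activity increased"
    if change < -threshold:
        return "activity decreased"
    return "no change detected"

# Hypothetical replicate readings, for illustration only.
standard_profile = [100.0, 110.0, 90.0]
with_candidate = [40.0, 50.0, 45.0]
print(classify_modulator(standard_profile, with_candidate))
```

A fixed threshold is the simplest possible decision rule; a practical screen would also account for assay variability across replicates.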

39. The method of claim 38, wherein the TLR-4 polypeptide has the amino acid
sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or SEQ ID
NO:99.

40. The method of claim 39, wherein the standard activity profile of the TLR-4
polypeptide is determined by determining the ability of the TLR-4 polypeptide to
stimulate transcription of a reporter gene, the reporter gene operatively positioned under
control of a nucleic acid segment comprising a promoter from a TLR-4 gene.

41. A method of modulating an LPS mediated response comprising modulating
TLR-4 function in an animal.

42. The method of claim 41, further comprising the step of diagnosing the animal for 
susceptibility to infection via analysis of a TLR-4-encoding nucleic acid sequence for a 
mutation relative to a sequence of a gene encoding a native TLR-4 polypeptide. 

43. The method of claim 41, comprising providing a TLR-4 polypeptide to the
animal. 

44. The method of claim 43, wherein the TLR-4 polypeptide is a TLR-4 polypeptide
that has the sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:98 or
SEQ ID NO:99.

45. The method of claim 43, wherein the provision of a TLR-4 polypeptide is
accomplished by inducing expression of a TLR-4 polypeptide. 



46. The method of claim 43, wherein the provision of a TLR-4 polypeptide is
accomplished by a method comprising introduction of a TLR-4-encoding nucleic acid to 
the animal. 

47. The method of claim 43, wherein the provision of a TLR-4 polypeptide is
accomplished by injecting a TLR-4 polypeptide into the animal.

48. The method of claim 41, wherein the step of modulating TLR-4 function in the 
animal comprises providing a modulator of TLR-4 to the animal. 

49. The method of claim 48, wherein the modulator of TLR-4 is an agonist of TLR-4.

50. The method of claim 48, wherein the modulator of TLR-4 modulates transcription 
of a TLR-4-encoding nucleic acid. 

51. The method of claim 48, wherein the modulator of TLR-4 modulates translation
of a TLR-4-encoding nucleic acid.

ABSTRACT



The present invention describes a mutant TLR-4 in mice that does not recognize
endotoxin and therefore does not stimulate the secretion of TNF from macrophages.
Methods of detecting the mutation are provided, as are methods of screening for drugs that
may stimulate TNF production. Finally, methods of incorporating and expressing the
mutant TLR-4 genes into a host cell are contemplated.

[Figure: amino acid sequence alignment of four Toll/TLR-4 polypeptide sequences, shown in blocks of 50 residues. Row labels as scanned: jtoll, ntoll, rattlr4, humtlr4.]

[Figure: groups labeled Vector, Ecto and Tlr4, with pairwise comparisons "Vector vs. Ecto" and "Vector vs. Tlr4".]

SEQUENCE LISTING



<110> BEUTLER, BRUCE 

POLTORAK, ALEXANDER 



<120> LPS-RESPONSE GENE COMPOSITIONS AND METHODS



<130> UTSD:602 



<140> UNKNOWN 
<141> 1999-09-15 



<150> 60/102,392 
<151> 1998-09-29 

<150> 60/100,403 
<151> 1998-09-15 



<160> 99 



<170> PatentIn Ver. 2.0



<210> 1 

<211> 4868 

<212> DNA 

<213> Homo sapiens 



<400> 1 

aaaatactcc cttgcctcaa aaactgctcg gtcaaacggt gatagcaaac cacgcattca 60
cagggccact gctgctcaca aaaccagtga ggatgatgcc aggatgatgt ctgcctcgcg 120
cctggctggg actctgatcc cagccatggc cttcctctcc tgcgtgagac cagaaagctg 180
ggagccctgc gtggaggtgg ttcctaatat tacttatcaa tgcatggagc tgaatttcta 240
caaaatcccc gacaacctcc ccttctcaac caagaacctg gacctgagct ttaatcccct 300
gaggcattta ggcagctata gcttcttcag tttcccagaa ctgcaggtgc tggatttatc 360
caggtgtgaa atccagacaa ttgaagatgg ggcatatcag agcctaagcc acctctctac 420
cttaatattg acaggaaacc ccatccagag tttagccctg ggagcctttt ctggactatc 480
aagtttacag aagctggtgg ctgtggagac aaatctagca tctctagaga acttccccat 540
tggacatctc aaaactttga aagaacttaa tgtggctcac aatcttatcc aatctttcaa 600
attacctgag tatttttcta atctgaccaa tctagagcac ttggaccttt ccagcaacaa 660
gattcaaagt atttattgca cagacttgcg ggttctacat caaatgcccc tactcaatct 720
ctctttagac ctgtccctga atcctatgaa ctttatccaa ccaggtgcat ttaaagaaat 780
taggcttcat aagctgactt taagaaataa ttttgatagt ttaaatgtaa tgaaaacttg 840
tattcaaggt ctggctggtt tagaagtcca tcgtttggtt ctgggagaat ttagaaatga 900
aggaaacttg gaaaagtttg acaaatctgc tctagagggc ctgtgcaatt tgaccattga 960
agaattccga ttagcatact tagactacta cctcgatgat attattgact tatttaattg 1020
tttgacaaat gtttcttcat tttccctggt gagtgtgact attgaaaggg taaaagactt 1080
ttcttataat ttcggatggc aacatttaga attagttaac tgtaaatttg gacagtttcc 1140
cacattgaaa ctcaaatctc tcaaaaggct tactttcact tccaacaaag gtgggaatgc 1200
tttttcagaa gttgatctac caagccttga gtttctagat ctcagtagaa atggcttgag 1260
tttcaaaggt tgctgttctc aaagtgattt tgggacaacc agcctaaagt atttagatct 1320
gagcttcaat ggtgttatta ccatgagttc aaacttcttg ggcttagaac aactagaaca 1380
tctggatttc cagcattcca atttgaaaca aatgagtgag ttttcagtat tcctatcact 1440
cagaaacctc atttaccttg acatttctca tactcacacc agagttgctt tcaatggcat 1500
cttcaatggc ttgtccagtc tcgaagtctt gaaaatggct ggcaattctt tccaggaaaa 1560
cttccttcca gatatcttca cagagctgag aaacttgacc ttcctggacc tctctcagtg 1620
tcaactggag cagttgtctc caacagcatt taactcactc tccagtcttc aggtactaaa 1680
tatgagccac aacaacttct tttcattgga tacgtttcct tataagtgtc tgaactccct 1740
ccaggttctt gattacagtc tcaatcacat aatgacttcc aaaaaacagg aactacagca 1800
ttttccaagt agtctagctt tcttaaatct tactcagaat gactttgctt gtacttgtga 1860
acaccagagt ttcctgcaat ggatcaagga ccagaggcag ctcttggtgg aagttgaacg 1920
aatggaatgt gcaacacctt cagataagca gggcatgcct gtgctgagtt tgaatatcac 1980
ctgtcagatg aataagacca tcattggtgt gtcggtcctc agtgtgcttg tagtatctgt 2040
tgtagcagtt ctggtctata agttctattt tcacctgatg cttcttgctg gctgcataaa 2100
gtatggtaga ggtgaaaaca tctatgatgc ctttgttatc tactcaagcc aggatgagga 2160
ctgggtaagg aatgagctag taaagaattt agaagaaggg gtgcctccat ttcagctctg 2220
ccttcactac agagacttta ttcccggtgt ggccattgct gccaacatca tccatgaagg 2280
tttccataaa agccgaaagg tgattgttgt ggtgtcccag cacttcatcc agagccgctg 2340
gtgtatcttt gaatatgaga ttgctcagac ctggcagttt ctgagcagtc gtgctggtat 2400
catcttcatt gtcctgcaga aggtggagaa gaccctgctc aggcagcagg tggagctgta 2460
ccgccttctc agcaggaaca cttacctgga gtgggaggac agtgtcctgg ggcggcacat 2520
cttctggaga cgactcagaa aagccctgct ggatggtaaa tcatggaatc cagaaggaac 2580
agtgggtaca ggatgcaatt ggcaggaagc aacatctatc tgaagaggaa aaataaaaac 2640
ctcctgaggc atttcttgcc cagctgggtc caacacttgt tcagttaata agtattaaat 2700
gctgccacat gtcaggcctt atgctaaggg tgagtaattc catggtgcac tagatatgca 2760
gggctgctaa tctcaaggag cttccagtgc agagggaata aatgctagac taaaatacag 2820
agtcttccag gtgggcattt caaccaactc agtcaaggaa cccatgacaa agaaagtcat 2880
ttcaactctt acctcatcaa gttgaataaa gacagagaaa acagaaagag acattgttct 2940
tttcctgagt cttttgaatg gaaattgtat tatgttatag ccatcataaa accattttgg 3000
tagttttgac tgaactgggt gttcactttt tcctttttga ttgaatacaa tttaaattct 3060
acttgatgac tgcagtcgtc aaggggctcc tgatgcaaga tgccccttcc attttaagtc 3120
tgtctcctta cagatgttaa agtctagtgg ctaattccta aggaaacctg attaacacat 3180
gctcacaacc atcctggtca ttctcgagca tgttctattt tttaactaat cacccctgat 3240
atatttttat ttttatatat ccagttttca tttttttacg tcttgcctat aagctaatat 3300
cataaataag gttgtttaag acgtgcttca aatatccata ttaaccacta tttttcaagg 3360
aagtatggaa aagtacactc tgtcactttg tcactcgatg tcattccaaa gttattgcct 3420
actaagtaat gactgtcatg aaagcagcat tgaaataatt tgtttaaagg gggcactctt 3480
ttaaacggga agaaaatttc cgcttcctgg tcttatcatg gacaatttgg gctagaggca 3540
ggaaggaagt gggatgacct caggaggtca ccttttcttg attccagaaa catatgggct 3600
gataaacccg gggtgacctc atgaaatgag ttgcagcaga agtttatttt tttcagaaca 3660
agtgatgttt gatggacctc tgaatctctt tagggagaca cagatggctg ggatccctcc 3720
cctgtaccct tctcactgcc aggagaacta cgtgtgaagg tattcaaggc agggagtata 3780
cattgctgtt tcctgttggg caatgctcct tgaccacatt ttgggaagag tggatgttat 3840
cattgagaaa acaatgtgtc tggaattaat ggggttctta taaagaaggt tcccagaaaa 3900
gaatgttcat tccagcttct tcaggaaaca ggaacattca aggaaaagga caatcaggat 3960
gtcatcaggg aaatgaaaat aaaaaccaca atgagatatc accttatacc aggtagatgg 4020
ctactataaa aaaatgaagt gtcatcaagg atatagagaa attggaaccc ttcttcactg 4080
ctggagggaa tggaaaatgg tgtagccgtt atgaaaaaca gtacggaggt ttctcaaaaa 4140
ttaaaaatag aactgctata tgatccagca atctcacttc tgtatatata cccaaaataa 4200
ttgaaatcag aatttcaaga aaatatttac actcccatgt tcattgtggc actcttcaca 4260
atcactgttt ccaaagttat ggaaacaacc caaatttcca ttggaaaata aatggacaaa 4320
ggaaatgtgc atataacgta caatggggat attattcagc ctaaaaaaag gggggatcct 4380
gttatttatg acaacatgaa taaacccgga ggccattatg ctatgtaaaa tgagcaagta 4440
acagaaagac aaatactgcc tgatttcatt tatatgaggt tctaaaatag tcaaactcat 4500
agaagcagag aatagaacag tggttcctag ggaaaaggag gaagggagaa atgaggaaat 4560
sgggagttgt ctaattggta taaaattata gtatgcaaga tgaattagct ctaaagatca 4620
gctgtatagc agagttcgta taatgaacaa tactgtatta tgcacttaac attttgttaa 4680
gagggtacct ctcatgttaa gtgttcttac catatacata tacacaagga agcttttgga 4740
ggtgatggat atatttatta ccttgattgt ggtgatggtt tgacaggtat gtgactatgt 4800
ctaaactcat caaattgtat acattaaata tatgcagttt tataatatca aaaaaaaaaa 4860
aaaaaaaa 4868

<210> 2

<211> 839 

<212> PRT 

<213> Homo sapiens 

<400> 2 

Met Met Ser Ala Ser Arg Leu Ala Gly Thr Leu Ile Pro Ala Met Ala
1 5 10 15

Phe Leu Ser Cys Val Arg Pro Glu Ser Trp Glu Pro Cys Val Glu Val
20 25 30

Val Pro Asn Ile Thr Tyr Gln Cys Met Glu Leu Asn Phe Tyr Lys Ile
35 40 45

Pro Asp Asn Leu Pro Phe Ser Thr Lys Asn Leu Asp Leu Ser Phe Asn
50 55 60

Pro Leu Arg His Leu Gly Ser Tyr Ser Phe Phe Ser Phe Pro Glu Leu
65 70 75 80

Gln Val Leu Asp Leu Ser Arg Cys Glu Ile Gln Thr Ile Glu Asp Gly
85 90 95

Ala Tyr Gln Ser Leu Ser His Leu Ser Thr Leu Ile Leu Thr Gly Asn
100 105 110

Pro Ile Gln Ser Leu Ala Leu Gly Ala Phe Ser Gly Leu Ser Ser Leu
115 120 125

Gln Lys Leu Val Ala Val Glu Thr Asn Leu Ala Ser Leu Glu Asn Phe
130 135 140

Pro Ile Gly His Leu Lys Thr Leu Lys Glu Leu Asn Val Ala His Asn
145 150 155 160

Leu Ile Gln Ser Phe Lys Leu Pro Glu Tyr Phe Ser Asn Leu Thr Asn
165 170 175

Leu Glu His Leu Asp Leu Ser Ser Asn Lys Ile Gln Ser Ile Tyr Cys
180 185 190

Thr Asp Leu Arg Val Leu His Gln Met Pro Leu Leu Asn Leu Ser Leu
195 200 205

Asp Leu Ser Leu Asn Pro Met Asn Phe Ile Gln Pro Gly Ala Phe Lys
210 215 220

Glu Ile Arg Leu His Lys Leu Thr Leu Arg Asn Asn Phe Asp Ser Leu
225 230 235 240

Asn Val Met Lys Thr Cys Ile Gln Gly Leu Ala Gly Leu Glu Val His
245 250 255

Arg Leu Val Leu Gly Glu Phe Arg Asn Glu Gly Asn Leu Glu Lys Phe
260 265 270

Asp Lys Ser Ala Leu Glu Gly Leu Cys Asn Leu Thr Ile Glu Glu Phe
275 280 285

Arg Leu Ala Tyr Leu Asp Tyr Tyr Leu Asp Asp Ile Ile Asp Leu Phe
290 295 300

Asn Cys Leu Thr Asn Val Ser Ser Phe Ser Leu Val Ser Val Thr Ile
305 310 315 320

Glu Arg Val Lys Asp Phe Ser Tyr Asn Phe Gly Trp Gln His Leu Glu
325 330 335

Leu Val Asn Cys Lys Phe Gly Gln Phe Pro Thr Leu Lys Leu Lys Ser
340 345 350

Leu Lys Arg Leu Thr Phe Thr Ser Asn Lys Gly Gly Asn Ala Phe Ser
355 360 365

Glu Val Asp Leu Pro Ser Leu Glu Phe Leu Asp Leu Ser Arg Asn Gly
370 375 380

Leu Ser Phe Lys Gly Cys Cys Ser Gln Ser Asp Phe Gly Thr Thr Ser
385 390 395 400

Leu Lys Tyr Leu Asp Leu Ser Phe Asn Gly Val Ile Thr Met Ser Ser
405 410 415

Asn Phe Leu Gly Leu Glu Gln Leu Glu His Leu Asp Phe Gln His Ser
420 425 430

Asn Leu Lys Gln Met Ser Glu Phe Ser Val Phe Leu Ser Leu Arg Asn
435 440 445

Leu Ile Tyr Leu Asp Ile Ser His Thr His Thr Arg Val Ala Phe Asn
450 455 460

Gly Ile Phe Asn Gly Leu Ser Ser Leu Glu Val Leu Lys Met Ala Gly
465 470 475 480

Asn Ser Phe Gln Glu Asn Phe Leu Pro Asp Ile Phe Thr Glu Leu Arg
485 490 495

Asn Leu Thr Phe Leu Asp Leu Ser Gln Cys Gln Leu Glu Gln Leu Ser
500 505 510

Pro Thr Ala Phe Asn Ser Leu Ser Ser Leu Gln Val Leu Asn Met Ser
515 520 525

His Asn Asn Phe Phe Ser Leu Asp Thr Phe Pro Tyr Lys Cys Leu Asn
530 535 540

Ser Leu Gln Val Leu Asp Tyr Ser Leu Asn His Ile Met Thr Ser Lys
545 550 555 560

Lys Gln Glu Leu Gln His Phe Pro Ser Ser Leu Ala Phe Leu Asn Leu
565 570 575

Thr Gln Asn Asp Phe Ala Cys Thr Cys Glu His Gln Ser Phe Leu Gln
580 585 590

Trp Ile Lys Asp Gln Arg Gln Leu Leu Val Glu Val Glu Arg Met Glu
595 600 605

Cys Ala Thr Pro Ser Asp Lys Gln Gly Met Pro Val Leu Ser Leu Asn
610 615 620

Ile Thr Cys Gln Met Asn Lys Thr Ile Ile Gly Val Ser Val Leu Ser
625 630 635 640

Val Leu Val Val Ser Val Val Ala Val Leu Val Tyr Lys Phe Tyr Phe
645 650 655

His Leu Met Leu Leu Ala Gly Cys Ile Lys Tyr Gly Arg Gly Glu Asn
660 665 670

Ile Tyr Asp Ala Phe Val Ile Tyr Ser Ser Gln Asp Glu Asp Trp Val
675 680 685

Arg Asn Glu Leu Val Lys Asn Leu Glu Glu Gly Val Pro Pro Phe Gln
690 695 700

Leu Cys Leu His Tyr Arg Asp Phe Ile Pro Gly Val Ala Ile Ala Ala
705 710 715 720

Asn Ile Ile His Glu Gly Phe His Lys Ser Arg Lys Val Ile Val Val
725 730 735

Val Ser Gln His Phe Ile Gln Ser Arg Trp Cys Ile Phe Glu Tyr Glu
740 745 750

Ile Ala Gln Thr Trp Gln Phe Leu Ser Ser Arg Ala Gly Ile Ile Phe
755 760 765

Ile Val Leu Gln Lys Val Glu Lys Thr Leu Leu Arg Gln Gln Val Glu
770 775 780

Leu Tyr Arg Leu Leu Ser Arg Asn Thr Tyr Leu Glu Trp Glu Asp Ser
785 790 795 800

Val Leu Gly Arg His Ile Phe Trp Arg Arg Leu Arg Lys Ala Leu Leu
805 810 815

Asp Gly Lys Ser Trp Asn Pro Glu Gly Thr Val Gly Thr Gly Cys Asn
820 825 830

Trp Gln Glu Ala Thr Ser Ile
835

<210> 3

<211> 3811 

<212> DNA 

<213> Homo sapiens 



<400> 3 

acagggccac tgctgctcac agaagcagtg aggatgatgc caggatgatg tctgcctcgc 60
gcctggctgg gactctgatc ccagccatgg ccttcctctc ctgcgtgaga ccagaaagct 120
gggagccctg cgtggagact tggccctaaa ccacacagaa gagctggcat gaaacccaga 180
gctttcagac tccggagcct cagcccttca ccccgattcc attgcttctt gctaaatgct 240
gccgttttat cacggaggtg gttcctaata ttacttatca atgcatggag ctgaatttct 300
acaaaatccc cgacaacctc cccttctcaa ccaagaacct ggacctgagc tttaatcccc 360
tgaggcattt aggcagctat agcttcttca gtttcccaga actgcaggtg ctggatttat 420
ccaggtgtga aatccagaca attgaagatg gggcatatca gagcctaagc cacctctcta 480
ccttaatatt gacaggaaac cccatccaga gtttagccct gggagccttt tctggactat 540
caagtttaca gaagctggtg gctgtggaga caaatctagc atctctagag aacttcccca 600
ttggacatct caaaactttg aaagaactta atgtggctca caatcttatc caatctttca 660
aattacctga gtatttttct aatctgacca atctagagca cttggacctt tccagcaaca 720
agattcaaag tatttattgc acagacttgc gggttctaca tcaaatgccc ctactcaatc 780
tctctttaga cctgtccctg aaccctatga actttatcca accaggtgca tttaaagaaa 840
ttaggcttca taagctgact ttaagaaata attttgatag tttaaatgta atgaaaactt 900
gtattcaagg tctggctggt ttagaagtcc atcgtttggt tctgggagaa tttagaaatg 960
aaggaaactt ggaaaagttt gacaaatctg ctctagaggg cctgtgcaat ttgaccattg 1020
aagaattccg attagcatac ttagactact acctcgatga tattattgac ttatttaatt 1080
gtttgacaaa tgtttcttca ttttccctgg tgagtgtgac tattgaaagg gtaaaagact 1140
tttcttataa tttcggatgg caacatttag aattagttaa ctgtaaattt ggacagtttc 1200
ccacattgaa actcaaatct ctcaaaaggc ttactttcac ttccaacaaa ggtgggaatg 1260
ctttttcaga agttgatcta ccaagccttg agtttctaga tctcagtaga aatggcttga 1320
gtttcaaagg ttgctgttct caaagtgatt ttgggacaac cagcctaaag tatttagatc 1380
tgagcttcaa tggtgttatt accatgagtt caaacttctt gggcttagaa caactagaac 1440
atctggattt ccagcattcc aatttgaaac aaatgagtga gttttcagta ttcctatcac 1500
tcagaaacct catttacctt gacatttctc atactcacac cagagttgct ttcaatggca 1560
tcttcaatgg cttgtccagt ctcgaagtct tgaaaatggc tggcaattct ttccaggaaa 1620
acttccttcc agatatcttc acagagctga gaaacttgac cttcctggac ctctctcagt 1680
gtcaactgga gcagttgtct ccaacagcat ttaactcact ctccagtctt caggtactaa 1740
atatgagcca caacaacttc ttttcattgg atacgtttcc ttataagtgt ctgaactccc 1800
tccaggttct tgattacagt ctcaatcaca taatgacttc caaaaaacag gaactacagc 1860
attttccaag tagtctagct ttcttaaatc ttactcagaa tgactttgct tgtacttgtg 1920
aacaccagag tttcctgcaa tggatcaagg accagaggca gctcttggtg gaagttgaac 1980
gaatggaatg tgcaacacct tcagataagc agggcatgcc tgtgctgagt ttgaatatca 2040
cctgtcagat gaataagacc atcattggtg tgtcggtcct cagtgtgctt gtagtatctg 2100
ttgtagcagt tctggtctat aagttctatt ttcacctgat gcttcttgct ggctgcataa 2160
agtatggtag aggtgaaaac atctatgatg cctttgttat ctactcaagc caggatgagg 2220
actgggtaag gaatgagcta gtaaagaatt tagaagaagg ggtgcctcca tttcagctct 2280
gccttcacta cagagacttt attcccggtg tggccattgc tgccaacatc atccatgaag 2340
gtttccataa aagccgaaag gtgattgttg tggtgtccca gcacttcatc cagagccgct 2400
ggtgtatctt tgaatatgag attgctcaga cctggcagtt tctgagcagt cgtgctggta 2460
tcatcttcat tgtcctgcag aaggtggaga agaccctgct caggcagcag gtggagctgt 2520
accgccttct cagcaggaac acttacctgg agtgggagga cagtgtcctg gggcggcaca 2580
tcttctggag acgactcaga aaagccctgc tggatggtaa atcatggaat ccagaaggaa 2640
cagtgggtac aggatgcaat tggcaggaag caacatctat ctgaagagga aaaataaaaa 2700
cctcctgagg catttcttgc ccagctgggt ccaacacttg ttcagttaat aagtattaaa 2760
tgctgccaca tgtcaggcct tatgctaagg gtgagtaatt ccatggtgca ctagatatgc 2820
agggctgcta atctcaagga gcttccagtg cagagggaat aaatgctaga ctaaaataca 2880
gagtcttcca ggtgggcatt tcaaccaact cagtcaagga acccatgaca aagaaagtca 2940
tttcaactct tacctcatca agttgaataa agacagagaa aacagaaaga gacattgttc 3000
ttttcctgag tcttttgaat ggaaattgta ttatgttata gccatcataa aaccattttg 3060
gtagttttga ctgaactggg tgttcacttt ttcctttttg attgaataca atttaaattc 3120
tacttgatga ctgcagtcgt caaggggctc ctgatgcaag atgccccttc cattttaagt 3180
ctgtctcctt acagaggtta aagtctaatg gctaattcct aaggaaacct gattaacaca 3240
tgctcacaac catcctggtc attctcgaac atgttctatt ttttaactaa tcacccctga 3300
tatattttta tttttatata tccagttttc atttttttac gtcttgccta taagctaata 3360
tcataaataa ggttgtttaa gacgtgcttc aaatatccat attaaccact atttttcaag 3420
gaagtatgga aaagtacact ctgtcacttt gtcactcgat gtcattccaa agttattgcc 3480
tactaagtaa tgactgtcat gaaagcagca ttgaaataat ttgtttaaag ggggcactct 3540
tttaaacggg aagaaaattt ccgcttcctg gtcttatcat ggacaatttg ggctataggc 3600
atgaaggaag tgggattacc tcaggaagtc accttttctt gattccagaa acatatgggc 3660
tgataaaccc ggggtgacct catgaaatga gttgcagcag atgtttattt ttttcagaac 3720
aagtgatgtt tgatggacct atgaatctat ttagggagac acagatggct gggatccctc 3780
ccctgtaccc ttctcactga caggagaact a 3811

<210> 4 

<211> 799 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Met Glu Leu Asn Phe Tyr Lys lie Pro Asp Asn Leu Pro Phe Ser Thr 
15 10 15 

Lys Asn Leu Asp Leu Ser Phe Asn Pro Leu Arg His Leu Gly Ser Tyr 
20 25 30 

Ser Phe Phe Ser Phe Pro Glu Leu Gln Val Leu Asp Leu Ser Arg Cys 
35 40 45 

Glu Ile Gln Thr Ile Glu Asp Gly Ala Tyr Gln Ser Leu Ser His Leu 
50 55 60 

Ser Thr Leu Ile Leu Thr Gly Asn Pro Ile Gln Ser Leu Ala Leu Gly 
65 70 75 80 

Ala Phe Ser Gly Leu Ser Ser Leu Gln Lys Leu Val Ala Val Glu Thr 
85 90 95 

Asn Leu Ala Ser Leu Glu Asn Phe Pro Ile Gly His Leu Lys Thr Leu 
100 105 110 

Lys Glu Leu Asn Val Ala His Asn Leu Ile Gln Ser Phe Lys Leu Pro 
115 120 125 

Glu Tyr Phe Ser Asn Leu Thr Asn Leu Glu His Leu Asp Leu Ser Ser 
130 135 140 

Asn Lys Ile Gln Ser Ile Tyr Cys Thr Asp Leu Arg Val Leu His Gln 
145 150 155 160 

Met Pro Leu Leu Asn Leu Ser Leu Asp Leu Ser Leu Asn Pro Met Asn 
165 170 175 

Phe Ile Gln Pro Gly Ala Phe Lys Glu Ile Arg Leu His Lys Leu Thr 
180 185 190 

Leu Arg Asn Asn Phe Asp Ser Leu Asn Val Met Lys Thr Cys Ile Gln 
195 200 205 

Gly Leu Ala Gly Leu Glu Val His Arg Leu Val Leu Gly Glu Phe Arg 
210 215 220 

Asn Glu Gly Asn Leu Glu Lys Phe Asp Lys Ser Ala Leu Glu Gly Leu 
225 230 235 240 

Cys Asn Leu Thr Ile Glu Glu Phe Arg Leu Ala Tyr Leu Asp Tyr Tyr 
245 250 255 

Leu Asp Asp Ile Ile Asp Leu Phe Asn Cys Leu Thr Asn Val Ser Ser 
260 265 270 

Phe Ser Leu Val Ser Val Thr Ile Glu Arg Val Lys Asp Phe Ser Tyr 
275 280 285 

Asn Phe Gly Trp Gln His Leu Glu Leu Val Asn Cys Lys Phe Gly Gln 
290 295 300 

Phe Pro Thr Leu Lys Leu Lys Ser Leu Lys Arg Leu Thr Phe Thr Ser 
305 310 315 320 

Asn Lys Gly Gly Asn Ala Phe Ser Glu Val Asp Leu Pro Ser Leu Glu 
325 330 335 

Phe Leu Asp Leu Ser Arg Asn Gly Leu Ser Phe Lys Gly Cys Cys Ser 
340 345 350 

Gln Ser Asp Phe Gly Thr Thr Ser Leu Lys Tyr Leu Asp Leu Ser Phe 
355 360 365 

Asn Gly Val Ile Thr Met Ser Ser Asn Phe Leu Gly Leu Glu Gln Leu 
370 375 380 

Glu His Leu Asp Phe Gln His Ser Asn Leu Lys Gln Met Ser Glu Phe 
385 390 395 400 

Ser Val Phe Leu Ser Leu Arg Asn Leu Ile Tyr Leu Asp Ile Ser His 
405 410 415 

Thr His Thr Arg Val Ala Phe Asn Gly Ile Phe Asn Gly Leu Ser Ser 
420 425 430 

Leu Glu Val Leu Lys Met Ala Gly Asn Ser Phe Gln Glu Asn Phe Leu 
435 440 445 

Pro Asp Ile Phe Thr Glu Leu Arg Asn Leu Thr Phe Leu Asp Leu Ser 
450 455 460 

Gln Cys Gln Leu Glu Gln Leu Ser Pro Thr Ala Phe Asn Ser Leu Ser 
465 470 475 480 

Ser Leu Gln Val Leu Asn Met Ser His Asn Asn Phe Phe Ser Leu Asp 
485 490 495 

Thr Phe Pro Tyr Lys Cys Leu Asn Ser Leu Gln Val Leu Asp Tyr Ser 
500 505 510 

Leu Asn His Ile Met Thr Ser Lys Lys Gln Glu Leu Gln His Phe Pro 
515 520 525 

Ser Ser Leu Ala Phe Leu Asn Leu Thr Gln Asn Asp Phe Ala Cys Thr 
530 535 540 

Cys Glu His Gln Ser Phe Leu Gln Trp Ile Lys Asp Gln Arg Gln Leu 
545 550 555 560 

Leu Val Glu Val Glu Arg Met Glu Cys Ala Thr Pro Ser Asp Lys Gln 
565 570 575 

Gly Met Pro Val Leu Ser Leu Asn Ile Thr Cys Gln Met Asn Lys Thr 
580 585 590 

Ile Ile Gly Val Ser Val Leu Ser Val Leu Val Val Ser Val Val Ala 
595 600 605 

Val Leu Val Tyr Lys Phe Tyr Phe His Leu Met Leu Leu Ala Gly Cys 
610 615 620 

Ile Lys Tyr Gly Arg Gly Glu Asn Ile Tyr Asp Ala Phe Val Ile Tyr 
625 630 635 640 

Ser Ser Gln Asp Glu Asp Trp Val Arg Asn Glu Leu Val Lys Asn Leu 
645 650 655 

Glu Glu Gly Val Pro Pro Phe Gln Leu Cys Leu His Tyr Arg Asp Phe 
660 665 670 

Ile Pro Gly Val Ala Ile Ala Ala Asn Ile Ile His Glu Gly Phe His 
675 680 685 

Lys Ser Arg Lys Val Ile Val Val Val Ser Gln His Phe Ile Gln Ser 
690 695 700 

Arg Trp Cys Ile Phe Glu Tyr Glu Ile Ala Gln Thr Trp Gln Phe Leu 
705 710 715 720 

Ser Ser Arg Ala Gly Ile Ile Phe Ile Val Leu Gln Lys Val Glu Lys 
725 730 735 

Thr Leu Leu Arg Gln Gln Val Glu Leu Tyr Arg Leu Leu Ser Arg Asn 
740 745 750 

Thr Tyr Leu Glu Trp Glu Asp Ser Val Leu Gly Arg His Ile Phe Trp 
755 760 765 

Arg Arg Leu Arg Lys Ala Leu Leu Asp Gly Lys Ser Trp Asn Pro Glu 
770 775 780 

Gly Thr Val Gly Thr Gly Cys Asn Trp Gln Glu Ala Thr Ser Ile 
785 790 795 






<210> 5 
<211> 3395 
<212> DNA 

<213> Rattus norvegicus 



<400> 5 

tcgagcggcc gcccgggcag gtttctaact tccctcctga gatgggctta ttaattctag 60 
aacaaaacca aaagtgagaa tgctaaggtt ggcactctca cttcctcttg ctctctagcc 120 
agtatacctt tgaatacaat atttacagag gggcaaccgc tgggagagaa ggggcagggg 180 
ccccagggac tctgccctgc caccatttac agttcgtcat gctttctcac ggcctccgct 240 
ggttgcagaa aatgccagga tgatgcctct cttgcatctg gctgggactc tgatcatggc 300 
attgttcctt tcctgcctga gaccaggaag cttgaatccc tgcatagagg tacttcctaa 360 
tattacctac caatgcatgg atcagaatct cagcaaaatc cctcatgaca tcccttattc 420 
aaccaagaac ctagatctga gcttcaaccc cctgaagatc ttaagaagct atagcttcac 480 
caatttctca caacttcagt ggctggattt atccaggtgt gaaattgaga caattgaaga 540 
caaggcatgg catggcttaa accagctctc aaccttggta ctgacaggaa accctatcaa 600 
gagtttttcc ccaggaagtt tttctggact aacaaattta gagaatctgg tggctgtgga 660 
gacaaaaatg acctctctag agggtttcca tattggacag cttatatcct taaagaaact 720 
aaatgtggct cataatctta tacattcctt taagttgcct gaatattttt ctaatctgac 780 
aaacctagaa catgtggatc tttcttataa ctatattcaa actatttctg tcaaagactt 840 
acagtttcta cgtgaaaatc cccaagtcaa tctctcttta gacctgtctt taaacccaat 900 
tgactccatt caagcccaag cctttcaggg aattaggctc catgaattga ctctaagaag 960 
taattttaat agctcaaatg tactgaaaat gtgccttcaa aacatgactg gtttacatgt 1020 
ccatcggttg atcttgggag aatttaaaaa tgaaaggaat ctggaaagtt ttgaccgttc 1080 
tgtcatggaa ggactatgca atgtgagcat tgatgagttc aggttaacat atataaatca 1140 
tttttcagat gatatttata atctcaattg cttggcaaat atttctgcaa tgtctttcac 1200 
aggtgtacat ataaaacaca tagcagatgt tcctaggcat ttcaaatggc aatccttatc 1260 
aatcattaga tgtcatctta agccttttcc aaagctgagt ctaccttttc ttaaaagttg 1320 
gactttaact accaacagag aggatatcag ctttggtcag ttggctctgc caagtctcag 1380 
atatctagat cttagtagaa atgccatgag ctttagaggt tgctgttctt attctgattt 1440 
tggaacaaac aacctgaagt acttagacct cagcttcaat ggtgtcatcc tgatgagtgc 1500 
caacttcatg ggtctagaag agctggaata cctggacttt cagcactcca ctttaaaaaa 1560 
ggtcacagaa ttctcagtgt tcttatctct tgaaaaactt ctttaccttg acatctctta 1620 
cactaatacc aaaattgact ttgatggcat atttcttggc ttgatcagtc tcaacacttt 1680 
aaaaatggct ggcaattctt tcaaagacaa caccctttca aatgtcttta caaacacaac 1740 
aaacttaaca ttcctggatc tttctaaatg ccaactggaa cagatatcta ggggggtatt 1800 
tgacacactc tacagactcc agttattaaa catgagtcac aacaacctac tgtttctgga 1860 
tccatcccat tataaacagc tgtactccct caggactctt gattgcagtt tcaatcgcat 1920 
agagacatcc aaaggaatac tgcaacattt tccaaagagt ctagccgtct tcaatctgac 1980 
taataattct gttgcttgta tatgtgaata tcagaatttc ttgcagtggg tcaaggacca 2040 
gaaaatgttc ttggtgaatg ttgaacaaat gaaatgtgca tcacctatag acatgaaggc 2100 
ctccctggtg ttggatttta cgaattccac ctgttatata tacaagacta tcatcagtgt 2160 
atcggtggtc agtgtgcttg tggtagccac tgtagcattt ctgatatacc acttctattt 2220 
tcacctgata cttattgctg gctgtaaaaa gtacagcaga ggagaaagca tctatgatgc 2280 
atttgtgatc tactcgagcc agaatgagga ctgggtgaga aacgagctgg taaagaattt 2340 
agaagaagga gtgccccgct ttcagctttg ccttcattac agggacttta ttcctggtgt 2400 
agccattgct gccaacatca tccaggaagg cttccacaag agccggaaag ttattgtggt 2460 
ggtgtctaga cactttatcc agagccgttg gtgtatcttt gaatatgaga ttgctcagac 2520 
atggcagttt ctgagtagcc gctctggcat catcttcatt gtccttgaga aagtggagaa 2580 
gtccttgctg aggcagcagg tcgaattgta tcgccttctt agcagaaaca cctacctcga 2640 
gtgggaggac aatgctctgg ggaggcacat cttctggaga agactcaaaa aagccctgtt 2700 
ggatggaaaa gccttgaatc cagatgaaac atcagaggaa gaacaagaag caacaacttt 2760 
gacctgagga gtacaaaact ctgcgcctaa aacccattat gtttacaatt tccgaatgct 2820 
acagttcatc tgggtttctg ctgtggacag ggaggccagg gagcacgagg cttctaacct 2880 
caacgacctc acagggcaca aggaagtagc aatgtgatga aaccccatac tttccatgtg 2940 
tatcaggtgt atgaattaag caactcaggc aaagaatcat aatcagcaaa gtttactctt 3000 
ataaaaccta aggagaggag gctaaggccc agtgagaaca gaaaggaaca tcattcttct 3060 

ctggatcttt gaatataagc acaacatgta gtgtgctgca gttaccttag aagagttttg 3120 

atcatttaaa ctgaagtgaa tgtttccttc ctttcccttt ttctattgaa tataatttaa 3180 

atggcactga ctctttttga gagaccctca ttcaaatttc ttcttccatt ttctgtcagt 3240 

ttcttttttt ttaaatctag ttctacaaga aatatgactg atacatgctc aaagatatcc 3300 

tggtcaatcc ttagaatgct atatttataa aataaaaatt tttagtgtac ttttattttt 3360 

taaaacaaaa aaaaaaaaaa aaaaaaaaaa aaaaa 3395 

<210> 6 
<211> 835 
<212> PRT 

<213> Rattus norvegicus 
<400> 6 

Met Met Pro Leu Leu His Leu Ala Gly Thr Leu Ile Met Ala Leu Phe 
1 5 10 15 

Leu Ser Cys Leu Arg Pro Gly Ser Leu Asn Pro Cys Ile Glu Val Leu 
20 25 30 

Pro Asn Ile Thr Tyr Gln Cys Met Asp Gln Asn Leu Ser Lys Ile Pro 
35 40 45 

His Asp Ile Pro Tyr Ser Thr Lys Asn Leu Asp Leu Ser Phe Asn Pro 
50 55 60 

Leu Lys Ile Leu Arg Ser Tyr Ser Phe Thr Asn Phe Ser Gln Leu Gln 
65 70 75 80 

Trp Leu Asp Leu Ser Arg Cys Glu Ile Glu Thr Ile Glu Asp Lys Ala 
85 90 95 

Trp His Gly Leu Asn Gln Leu Ser Thr Leu Val Leu Thr Gly Asn Pro 
100 105 110 

Ile Lys Ser Phe Ser Pro Gly Ser Phe Ser Gly Leu Thr Asn Leu Glu 
115 120 125 

Asn Leu Val Ala Val Glu Thr Lys Met Thr Ser Leu Glu Gly Phe His 
130 135 140 

Ile Gly Gln Leu Ile Ser Leu Lys Lys Leu Asn Val Ala His Asn Leu 
145 150 155 160 

Ile His Ser Phe Lys Leu Pro Glu Tyr Phe Ser Asn Leu Thr Asn Leu 
165 170 175 

Glu His Val Asp Leu Ser Tyr Asn Tyr Ile Gln Thr Ile Ser Val Lys 
180 185 190 

Asp Leu Gln Phe Leu Arg Glu Asn Pro Gln Val Asn Leu Ser Leu Asp 
195 200 205 

Leu Ser Leu Asn Pro Ile Asp Ser Ile Gln Ala Gln Ala Phe Gln Gly 
210 215 220 

Ile Arg Leu His Glu Leu Thr Leu Arg Ser Asn Phe Asn Ser Ser Asn 
225 230 235 240 

Val Leu Lys Met Cys Leu Gln Asn Met Thr Gly Leu His Val His Arg 
245 250 255 

Leu Ile Leu Gly Glu Phe Lys Asn Glu Arg Asn Leu Glu Ser Phe Asp 
260 265 270 

Arg Ser Val Met Glu Gly Leu Cys Asn Val Ser Ile Asp Glu Phe Arg 
275 280 285 

Leu Thr Tyr Ile Asn His Phe Ser Asp Asp Ile Tyr Asn Leu Asn Cys 
290 295 300 

Leu Ala Asn Ile Ser Ala Met Ser Phe Thr Gly Val His Ile Lys His 
305 310 315 320 

Ile Ala Asp Val Pro Arg His Phe Lys Trp Gln Ser Leu Ser Ile Ile 
325 330 335 

Arg Cys His Leu Lys Pro Phe Pro Lys Leu Ser Leu Pro Phe Leu Lys 
340 345 350 

Ser Trp Thr Leu Thr Thr Asn Arg Glu Asp Ile Ser Phe Gly Gln Leu 
355 360 365 

Ala Leu Pro Ser Leu Arg Tyr Leu Asp Leu Ser Arg Asn Ala Met Ser 
370 375 380 

Phe Arg Gly Cys Cys Ser Tyr Ser Asp Phe Gly Thr Asn Asn Leu Lys 
385 390 395 400 

Tyr Leu Asp Leu Ser Phe Asn Gly Val Ile Leu Met Ser Ala Asn Phe 
405 410 415 

Met Gly Leu Glu Glu Leu Glu Tyr Leu Asp Phe Gln His Ser Thr Leu 
420 425 430 

Lys Lys Val Thr Glu Phe Ser Val Phe Leu Ser Leu Glu Lys Leu Leu 
435 440 445 

Tyr Leu Asp Ile Ser Tyr Thr Asn Thr Lys Ile Asp Phe Asp Gly Ile 
450 455 460 

Phe Leu Gly Leu Ile Ser Leu Asn Thr Leu Lys Met Ala Gly Asn Ser 
465 470 475 480 

Phe Lys Asp Asn Thr Leu Ser Asn Val Phe Thr Asn Thr Thr Asn Leu 
485 490 495 

Thr Phe Leu Asp Leu Ser Lys Cys Gln Leu Glu Gln Ile Ser Arg Gly 
500 505 510 

Val Phe Asp Thr Leu Tyr Arg Leu Gln Leu Leu Asn Met Ser His Asn 
515 520 525 

Asn Leu Leu Phe Leu Asp Pro Ser His Tyr Lys Gln Leu Tyr Ser Leu 
530 535 540 

Arg Thr Leu Asp Cys Ser Phe Asn Arg Ile Glu Thr Ser Lys Gly Ile 
545 550 555 560 

Leu Gln His Phe Pro Lys Ser Leu Ala Val Phe Asn Leu Thr Asn Asn 
565 570 575 

Ser Val Ala Cys Ile Cys Glu Tyr Gln Asn Phe Leu Gln Trp Val Lys 
580 585 590 

Asp Gln Lys Met Phe Leu Val Asn Val Glu Gln Met Lys Cys Ala Ser 
595 600 605 

Pro Ile Asp Met Lys Ala Ser Leu Val Leu Asp Phe Thr Asn Ser Thr 
610 615 620 

Cys Tyr Ile Tyr Lys Thr Ile Ile Ser Val Ser Val Val Ser Val Leu 
625 630 635 640 

Val Val Ala Thr Val Ala Phe Leu Ile Tyr His Phe Tyr Phe His Leu 
645 650 655 

Ile Leu Ile Ala Gly Cys Lys Lys Tyr Ser Arg Gly Glu Ser Ile Tyr 
660 665 670 

Asp Ala Phe Val Ile Tyr Ser Ser Gln Asn Glu Asp Trp Val Arg Asn 
675 680 685 

Glu Leu Val Lys Asn Leu Glu Glu Gly Val Pro Arg Phe Gln Leu Cys 
690 695 700 

Leu His Tyr Arg Asp Phe Ile Pro Gly Val Ala Ile Ala Ala Asn Ile 
705 710 715 720 

Ile Gln Glu Gly Phe His Lys Ser Arg Lys Val Ile Val Val Val Ser 
725 730 735 

Arg His Phe Ile Gln Ser Arg Trp Cys Ile Phe Glu Tyr Glu Ile Ala 
740 745 750 

Gln Thr Trp Gln Phe Leu Ser Ser Arg Ser Gly Ile Ile Phe Ile Val 
755 760 765 

Leu Glu Lys Val Glu Lys Ser Leu Leu Arg Gln Gln Val Glu Leu Tyr 
770 775 780 

Arg Leu Leu Ser Arg Asn Thr Tyr Leu Glu Trp Glu Asp Asn Ala Leu 
785 790 795 800 

Gly Arg His Ile Phe Trp Arg Arg Leu Lys Lys Ala Leu Leu Asp Gly 
805 810 815 

Lys Ala Leu Asn Pro Asp Glu Thr Ser Glu Glu Glu Gln Glu Ala Thr 
820 825 830 

Thr Leu Thr 
835 



<210> 7 

<211> 24 

<212> DNA 

<213> Mus musculus 

<400> 7 

tgaacacata tataccaagg cagc 

<210> 8 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 8 

accagagggt cattctccaa 

<210> 9 

<211> 26 

<212> DNA 

<213> Mus musculus 

<400> 9 

caaaatatct gacaaaaaca agtgtg 

<210> 10 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 10 

ggtgtcatca ccatgatgga 

<210> 11 

<211> 23 

<212> DNA 

<213> Mus musculus 

<400> 11 

agtaagcaat gttcactcca acc 

<210> 12 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 12 

tcccagcatt gatgctcac 

<210> 13 

<211> 20 

<212> DNA 

<213> Mus musculus 



<400> 13 

stgtgtgcca ttttgcatgt 



20 



<210> 14 
<211> 24 
<212> DNA 

<213> Mus musculus 
<400> 14 

agtattgctt gataaatttg catg 24 

<210> 15 
<211> 25 
<212> DNA 

<213> Mus musculus 
<400> 15 

gttccgtttc tttttacaac tatgg 25 

<210> 16 

<211> 26 

<212> DNA 

<213> Mus musculus 

<400> 16 

atttgcctat tttattttca tttgtg 26 

<210> 17 
<211> 18 
<212> DNA 

<213> Mus musculus 
<400> 17 

ggaaggttga agcaagac 18 

<210> 18 
<211> 22 
<212> DNA 

<213> Mus musculus 
<400> 18 

gactcatgat ttgataactg ac 22 

<210> 19 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 19 

gccaagaaag agcaaatag 19 

<210> 20 

<211> 19 

<212> DNA 

<213> Mus musculus 
<400> 20 

cgattcctat ggctcagcc 



<210> 21 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 21 

agtaattcag cttctcccaa 

<210> 22 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 22 

cagatccatg atacagatat gc 

<210> 23 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 23 

cctccagcac agtgtacaat g 

<210> 24 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 24 

gtgtgtgtgt gtgtaagctt g 

<210> 25 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 25 

tagaaagtgg aaacatctga c 

<210> 26 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 26 

atgtaactca atcacagaac tc 

<210> 27 

<211> 20 

<212> DNA 

<213> Mus musculus 



<400> 27 

tcaagatcca taacctagac 
<210> 28 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 28 

agacagacag atagacagaa ag 22 

<210> 29 

<211> 23 

<212> DNA 

<213> Mus musculus 

<400> 29 

gccctgaagg taaatcagta act 23 

<210> 30 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 30 

gctcaggagg tacattgcct 20 

<210> 31 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 31 

tcagtttgct tgcattctc 19 

<210> 32 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 32 

aagtatggat gtgtgtgtaa g 21 

<210> 33 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 33 

tgctaagatt gtgatgactg 20 

<210> 34 

<211> 21 

<212> DNA 

<213> Mus musculus 
<400> 34 

gactaggtga gagaaacaga c 
<210> 35 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 35 

ttgggctgat agtacaatat ac 22 

<210> 36 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 36 

ggagatttct aatgcttgg 19 

<210> 37 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 37 

tggacaaaca ccacataaca 20 

<210> 38 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 38 

cagactatca gatgactga 19 

<210> 39 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 39 

acattagaat catttcctgc a 21 

<210> 40 

<211> 18 

<212> DNA 

<213> Mus musculus 

<400> 40 

gcaaagtctt gtgagtct 18 

<210> 41 

<211> 21 

<212> DNA 

<213> Mus musculus 



18 



<400> 41 

cttaactgga gaggaaagat c 
<210> 42 
<211> 22 
<212> DNA 
<213> Mus musculus 



<400> 42 

cagttctgtc tttgtatctc tg 22 

<210> 43 

<211> 19 

<212> DNA 

<213> Mus musculus 



<400> 43 

agagagtgag cctcagtct 19 

<210> 44 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 44 

ttgggtgatg attgtgaac 19 

<210> 45 

<211> 2951 

<212> DNA 

<213> Mus musculus 



<400> 45 

cctcctgcga cggggcagat cgattctaga acaaaaccaa aagtgagaat gctaaggttg 60 
gcactctcac ttcctctttg aatatagtac ttgcagaggg gcacccactg ggagggaaga 120 
ggcaggtgtc ccagggactc tgcgctgcca ccagttacag atcgtcatgt tctctcatgg 180 
cctccactgg ttgcagaaaa tgccaggatg atgcctccct ggctcctggc taggactctg 240 
atcatggcac tgttcttctc ctgcctgaca ccaggaagct tgaatccctg catagaggta 300 
gttcctaata ttacctacca atgcatggat cagaaactca gcaaagtccc tgatgacatt 360 
ccttcttcaa ccaagaacat agatctgagc ttcaacccct tgaagatctt aaaaagctat 420 
agcttctcca atttttcaga acttcagtgg ctggatttat ccaggtgtga aattgaaaca 480 
attgaagaca aggcatggca tggcttacac cacctctcaa acttgatact gacaggaaac 540 
cctatccaga gtttttcccc aggaagtttc tctggactaa caagtttaga gaatctggtg 600 
gctgtggaga caaaattggc ctctctagaa agcttcccta ttggacagct tataacctta 660 
aagaaactca atgtggctca caattttata cattcctgta agttacctgc atatttttcc 720 
aatctgacga acctagtaca tgtggatctt tcttataact atattcaaac tattactgtc 780 
aacgacttac agtttctacg tgaaaatcca caagtcaatc tctctttaga catgtctttg 840 
aacccaattg acttcattca agaccaagcc tttcagggaa ttaagctcca tgaactgact 900 
ctaagaggta attttaatag ctcaaatata atgaaaactt gccttcaaaa cctggctggt 960 
ttacacgtcc atcggttgat cttgggagaa tttaaagatg aaaggaatct ggaaattttt 1020 
gaaccctcta tcatggaagg actatgtgat gtgaccattg atgagttcag gttaacatat 1080 
acaaatgatt tttcagatga tattgttaag ttccattgct tggcgaatgt ttctgcaatg 1140 
tctctggcag gtgtatctat aaaatatcta gaagatgttc ctaaacattt caaatggcaa 1200 
tccttatcaa tcattagatg tcaacttaag cagtttccaa ctctggatct accctttctt 1260 
aaaagtttga ctttaactat gaacaaaggg tctatcagtt ttaaaaaagt ggccctacca 1320 
agtctcagct atctagatct tagtagaaat gcactgagct ttagtggttg ctgttcttat 1380 
tctgatttgg gaacaaacag cctgagacac ttagacctca gcttcaatgg tgccatcatt 1440 
atgagtgcca atttcatggg tctagaagag ctgcagcacc tggattttca gcactctact 1500 
ttaaaaaggg tcacagaatt ctcagcgttc ttatcccttg aaaagctact ttaccttgac 1560 
atctcttata ctaacaccaa aattgacttc gatggtatat ttcttggctt gaccagtctc 1620 
aacacattaa aaatggctgg caattctttc aaagacaaca ccctttcaaa tgtctttgca 1680 
aacacaacaa acttgacatt cctggatctt tctaaatgtc aattggaaca aatatcttgg 1740 
ggggtatttg acaccctcca tagacttcaa ttattaaata tgagtcacaa caatctattg 1800 
tttttggatt catcccatta taaccagctg tattccctca gcactcttga ttgcagtttc 1860 
aatcgcatag agacatctaa aggaatactg caacattttc caaagagtct agccttcttc 1920 
aatcttacta acaattctgt tgcttgtata tgtgaacatc agaaattcct gcagtgggtc 1980 
aaggaacaga agcagttctt ggtgaatgtt gaacaaatga catgtgcaac acctgtagag 2040 
atgaatacct ccttagtgtt ggattttaat aattctacct gttatatgta caagacaatc 2100 
atcagtgtgt cagtggtcag tgtgattgtg gtatccactg tagcatttct gatataccac 2160 
ttctattttc acctgatact tattgctggc tgtaaaaagt acagcagagg agaaagcatc 2220 
tatgatgcat ttgtgatcta ctcgagtcag aatgaggact gggtgagaaa tgagctggta 2280 
aagaatttag aagaaggagt gccccgcttt cacctctgcc ttcactacag agactttatt 2340 
catggtgtag ccattgctgc caacatcatc caggaaggct tccacaagag ccggaaggtt 2400 
attgtggtag tgtctagaca ctttattcag agccgttggt gtatctttga atatgagatt 2460 
gctcaaacat ggcagtttct gagcagccgc tctggcatca tcttcattgt ccttgagaag 2520 
gttgagaagt ccctgctgag gcagcaggtg gaattgtatc gccttcttag cagaaacacc 2580 
tacctggaat gggaggacaa tcctctgggg aggcacatct tctggagaag acttaaaaat 2640 
gccctattgg atggaaaagc ctcgaatcct gagcaaacag cagaggaaga acaagaaacg 2700 
gcaacttgga cctgaggaga acaaaactct ggggcctaaa cccagtctgt ttgcaattaa 2760 
taaatgctac agctcacctg gggctctgct atggaccgag agcccatgga acacatggct 2820 
gctaagctat agcatggacc ttaccgggca gaaggaagta gcactgacac cttcctttcc 2880 
aggggtatga attacctaac tcgggaaaag aaacataatc cagaatcttt acctttaatc 2940 
tgaaggagaa g 2951 



<210> 46 

<211> 2951 

<212> DNA 

<213> Mus musculus 



<400> 46 

cctcctgcga cggggcagat cgattctaga acaaaaccaa aagtgagaat gctaaggttg 60 
gcactctcac ttcctctttg aatatagtac ttgcagaggg gcacccactg ggagggaaga 120 
ggcaggtgtc ccagggactc tgcgctgcca ccagttacag atcgtcatgt tctctcatgg 180 
cctccactgg ttgcagaaaa tgccaggatg atgcctccct ggctcctggc taggactctg 240 
atcatggcac tgttcttctc ctgcctgaca ccaggaagct tgaatccctg catagaggta 300 
gttcctaata ttacctacca atgcatggat cagaaactca gcaaagtccc tgatgacatt 360 
ccttcttcaa ccaagaacat agatctgagc ttcaacccct tgaagatctt aaaaagctat 420 
agcttctcca atttttcaga acttcagtgg ctggatttat ccaggtgtga aattgaaaca 480 
attgaagaca aggcatggca tggcttacac cacctctcaa acttgatact gacaggaaac 540 
cctatccaga gtttttcccc aggaagtttc tctggactaa caagtttaga caatctggtg 600 
gctgtggaga caaaattggc ctctctagaa agcttcccta ttggacagct tataacctta 660 
aagaaactca atgtggctca caattttata cattcctgta agttacctgc atatttttcc 720 
aatctgacga acctagtaca tgtggatctt tcttataact atattcaaac tattactgtc 780 
aacgacttac agtttctacg tgaaaatcca caagtcaatc tctctttaga catgtctttg 840 
aacccaattg acttcattca agaccaagcc tttcagggaa ttaagctcca tgaactgact 900 
ctaagaggta attttaatag ctcaaatata atgaaaactt gccttcaaaa cctggctggt 960 
ttacacgtcc atcggttgat cttgggagaa tttaaagatg aaaggaatct ggaaattttt 1020 
gaaccctcta tcatggaagg actatgtgat gtgaccattg atgagttcag gttaacatat 1080 
acaaatgatt tttcagatga tattgttaag ttccattgct tggcgaatgt ttctgcaatg 1140 
tctctggcag gtgtatctat aaaatatcta gaagatgttc ctaaacattt caaatggcaa 1200 
tccttatcaa tcattagatg tcaactaagc agtttccaac tctggatcta ccctttctta 1260 
aaagtttgac tttaactatg aacaaagggt ctatcagttt taaaaaagtg gccctaccaa 1320 
gtctcagcta tctagatctt agtagaaatg cactgagctt tagtggtggc tgttcttatt 1380 
ctgatttggg aacaaacagc ctgagacact tagacctcag cttcaatggt gccatcatta 1440 
tgagtgccaa tttcatgggt ctagaagagc tgcagcacct ggatttttca gcactctact 1500 
ttaaaaaggg tcacagaatt ctcagcgttc ttatcccttg aaaagctact ttaccttgac 1560 
atctcttata ctaacaccaa aattgacttc gatggtatat ttcttggctt gaccagtctc 1620 
aacacattaa aaatggctgg caattctttc aaagacaaca ccctttcaaa tgtctttgca 1680 
aacacaacaa acttgacatt cctggatcct tctaaatgtc aattggaaca aatatcttgg 1740 
ggggtatttg acaccctcca tagacttcaa ttattaaata tgagtcacaa caatctattg 1800 
tttttggatt catcccatta taaccagctg tattccctca gcactcttga ttgcagtttc 1860 
aatcgcatag agacatctaa aggaatactg caacattttc caaagagtct agccttcttc 1920 
aatcttacta acaattctgt tgcttgtata tgtgaacatc agaaattcct gcagtgggtc 1980 
aaggaacaga agcagttctt ggtgaatgtt gaacaaatga catgtgcaac acctgtagag 2040 
atgaatacct ccttagtgtt ggattttaat aattctacct gttatatgta caagacaatc 2100 
atcagtgtgt cagtggtcag tgtgattgtg gtatccactg tagcatttct gatataccac 2160 
ttctattttc acctgatact tattgctggc tgtaaaaagt acagcagagg agaaagcatc 2220 
tatgatgcat ttgtgatcta ctcgagtcag aatgaggact gggtgagaaa tgagctggta 2280 
aagaatttag aagaaggagt gccccgcttt cacctctgcc ttcactacag agactttatt 2340 
cctggtgtag ccattgctgc caacatcatc caggaaggct tccacaagag ccggaaggtt 2400 
attgtggtag tgtctagaca ctttattcag agccgttggt gtatctttga atatgagatt 2460 
gctcaaacat ggcagtttct gagcagccgc tctggcatca tcttcattgt ccttgagaag 2520 
gttgagaagt ccctgctgag gcagcaggtg gaattgtatc gccttcttag cagaaacacc 2580 
tacctggaat gggaggacaa tcctctgggg aggcacatct tctggagaag acttaaaaat 2640 
gccctattgg atggaaaagc ctcgaatcct gagcaaacag cagaggaaga acaagaaacg 2700 
gcaacttgga cctgaggaga acaaaactct ggggcctaaa cccagtctgt ttgcaattaa 2760 
taaatgctac agctcacctg gggctctgct atggaccgag agcccatgga acacatggct 2820 
gctaagctat agcatggacc ttaccgggca gaaggaagta gcactgacac cttcctttcc 2880 
aggggtatga attacctaac tcgggaaaag aaacataatc cagaatcttt acctttaatc 2940 
tgaaggagaa g 2951 

<210> 47 

<211> 18989 

<212> DNA 

<213> Homo sapiens 

<400> 47 

tcccctactt tcttcacatt ctgcagtaaa cttggaggct gcatgttgaa tatgaaagta 60 
taatgaaata aaagaagcct agaaccagga atcatacctg gggtaatcca atcagaaata 120 
tcctcattga gtgtttcatg agccaggaaa acttttatta agtcacaata aaatctggaa 180 
gtttatacag caattagctt agtctaacac ttgtcagttt tgtgcatatt tcttacagca 240 
tatgcattac ctgccaaata aaagcaaaca cttctaggtc cctggcgaat atgggattcc 300 
tccattgact gactgattat gggtcctgag ttgaacttgc tctgcatgaa ggatgtaggc 360 
gatcaagtgg cttgttttgc ctctggccaa atctctacca ctatgcttaa gatgcgatta 420 
attatgtaca acaaaccccc atgacacacg tttacctatg taacaaacct gctcatcctg 480 
cacatgtact tctgaatgta aaaataaaag taaaaaaaaa gaaaacaaga ggtggttatt 540 
attctactgt gggagaaatt ataggcccat aatggtaact aatcaccacg gtcttacctc 600 
attataatac tgcatcggta agttcatcaa cataagcaag ttagatctga taaccaaggg 660 
gcttacagtt tctaatttgt atttgacaca tggtctgcct tctggaagag cagcatagaa 720 
cctagatgtc tttgattaag gtcagtaaat gattgagtgt taatcccatt catttcccag 780 
gaaaaggaaa cctctttaca agtcaccacc agggattctc caatcacaca taggaaaaat 840 
ttccaggaag acttctataa aacacatgta ttaacatctc cgaaaacata gttgaaagga 900 
cttccctggg cccttttcct tagttcctca tctagactat caagcggttt cctctccaaa 960 
tgatgggaag aaagtgcatt tgtctattac acacttgtat tactctattc acttaagcac 1020 
tgtgtcccag taatggggtc tagttatgtc tggcttgaaa tgacccacat atttgtttct 1080 
cattcttagg aagtggagtg tttctgtatg tgtatatgtg atgggggtag gccaggagat 1140 
tttttatcta ggcaataccc agcctgaaat cattattagc atgacatgag ttaaacgtat 1200 
ttctatttta gaaagatgtt ttcaacagca ggatgaagaa tcaattggaa gagctggtac 1260 
attgaaagag gtgaatctag actttgggag gcttcttaaa gtatattgaa ctagtctagg 1320 
ccgtgggata tgttcaatag taatggtagt agaaatggcg actgacattt tggaattatt 1380 
ttacagatac aatttctaca acttggtgga acatttttta aaatgtaggt tttattattc 1440 
ggctatggtg aaaacaacag atcagaagat gatgccactg gaaatatagt ttgttgttta 1500 
cagttcctaa gaagcggggg catgccacac catgcagggc cacattggta gcaccagagt 1560 
ccgtcaggag gcagagggag caagaggaaa ttataggcac aagcttttat tgttgttact 1620 
gcagaaaagg caaggcaagg cagggtaagc agggatagga ctggctagtt tgaataacct 1680 
cagtgggctc tggggtagag ggtctgtctc tagttgtctg gtacctggac ctgtgatgat 1740 
tagggctgaa taacagtgtc tacttgggtg taaaagccag gtagaggagg tggttcagag 1800 
gaagggctct ggattgctta gtgtgcataa ggcatgctcc agagcaaatc ttttgctatt 1860 
ttttagaact aactagccct ggtaagtgca gtctcttccc agatgccaga acatcaagaa 1920 
cacagaaaag aagacaattg ggttaataca tgtttagcat gagaaatgag gaagtaaggg 1980 
aaataaagtc aaagagattt ccaccttgga tgactatgtc aaagtgaaac accattaact 2040 
ttccagggaa ctaaacttta ttgagcacct actctgtgtc aggcactgct ctaaaatctt 2100 
tacatgaata atctcaatac tcagagcaaa gctttgacat ggaggttgtt tttatcttaa 2160 
ctctactggt gtgttgatgg agtctacaag agtttgtgcc cagtccacca caaaatggtc 2220 
cctcacagct tggtttttga cacgttggat tggaagtgct tggaggatat tacagtagaa 2280 
ctatctagga cttagcatac ataatattcc tgttttaaat caggttctta tttaacagaa 2340 
acttacattg cacttgctac tttccagaca ctgtcctaaa agctttacaa atgccagttc 2400 
atttaatccc aatacaatac tttgagatac atattatcat cttcattcta tccacatttt 2460 
caatcctcat catagctctc atttatggaa tgtaatgatg atgctctaga ctagacgttt 2520 
tacgtaagtt agcttaattc agtaattcaa aacacatgcg attatcttcg ttttaaagac 2580 
cagaaaacta aaggttggta ggtttgtata atttgactac cattgcgtat ctttatttta 2640 
atacatttta taaatgcaag cttctgctat gattaaaagt gattaccaca ttttacagac 2700 
cagaaagtaa taataagtgt tggtgaagat gtgaaaaaat gagaactcct gtacaccatt 2760 
tgtgggaatg taaaatggta cagatgctgt ggagaatcat atggtgggtg ctcaaaaaat 2820 
taaaaataga tttaccacat gatccagcaa tctcacttct gagtacgtat ccaaaagaat 2880 
tgaaaacaga gactttaaga gatatttgta caaccatgtt tatggcagca ttattcacaa 2940 
tagctaacgt gtggcaacaa tgcaagtgtc catgaacaga caaatggata agcaaaatgt 3000 
ggtctataca tacaatggaa tattgttcag ctttaaaaag gaaggaggct ttgatctata 3060 
ctacacagaa aagaaccttg aggacattat gcaaagtgaa ataagccagt gacaaaaaga 3120 
tacatactgt atgattccac ttctaagagc tgcctagagt agtcaagatt atagagacaa 3180 
aagtagtgca tagattcaag ggcctaggga aaggggaaat ggggagttat ttattaatga 3240 
atagtggtga tgattgtaca aaaatatgaa cataattaat gccactaaat tgtacacata 3300 
caaatggtca agataataaa ttttatgtta tgtcatgtta tgttatgtga ttttaccata 3360 
atacagaaaa tgaaaaaaga aaagaaagaa agtaaagctt agcggtttac atgacttgac 3420 
caatgcctca aagccatgag tcacccagct gagatctgaa cttcagtata ttccattctg 3480 
aaatcccaga cttttcccaa tcttcttgta cttttcaaac tgtgtttcag ttgaggttta 3540 
ttttcagttt tgtatgtgag tttcttcaca agaaggggcg ggccaaattg tgtcctgcaa 3600 
aaacctacat atcgaagtcc taacccctct acctcagact atgactgtat atggagagag 3660 
agccttgaaa gaggtatgta aggtagaatg aggtcattat ggtgggccct aatccaacat 3720 
aactggtgtc cttataagaa ggggagatta gaattcagac acacttgctg acaccttgag 3780 
ttcagactgg aagcctctag aattgtgaga aaatgaatgt ctgttgttta agccacccag 3840 
tctgtggtat ttccttatgg cagccccagc aaactaatac aaatagtgtt tccacagctg 3900 
aaacaaaatt ggaaaatcac cgtcatccta gagagttaca agggctattt taatagaacc 3960 
tgattgtttt cctaaattca ccaagcccag gcagaggtca gatgactaat tgggataaaa 4020 
gccaactagc ttcctcttgc tgtttcttta gccactggtc tgcaggcgtt ttcttcttct 4080 
aacttcctct cctgtgacaa aagagataac tattagagaa acaaaagtcc agaatgctaa 4140 
ggttgccgct ttcacttcct ctcacccttt agcccagaac tgctttgaat acaccaattg 4200 
ctgtggggcg gctcgaggaa gagaagacac cagtgcctca gaaactgctc ggtcaaacgg 4260 
tgatagcaaa ccacgcattc acagggccac tgctgctcac agaagcagtg aggatgatgc 4320 
caggatgatg tctgcctcgc gcctggctgg gactctgatc ccagccatgg ccttcctctc 4380 
ctgcgtgaga ccagaaagct gggagccctg cgtggaggta tgtggctgga gtcagctcct 4440 
ctgaactttc cctcacttct gcccagaact tctcactgtg tgccctggtt tgtttatttt 4500 
tgcaaaaaaa aaaagagtta aattacctta aagactcaag aagccacaga gatcaaataa 4560 
ttcattgtta cagggcacta gaggcagcca ttgggggttt gttccatttg gaaattttga 4620 
gtgctaacag gggcatgaga taacatagat ctgcttaagg tccctgctct gctaccttgt 4680 
ggctctgtga agaaattatc aaacctgtct gagactagtt ttcgcatctg taagagaatt 4740 
ataatacctt cttcactaga gagtaagcag actgcttcag tgtcatttct tcccactggt 4800 
ggtctttaca ctcagcttca agcagtcacc ctgctccttt caatctcagg aaaaagatgg 4860 
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cttttgtgtg tgtgtctcta gagaaagaac 
agtaatatag tttagtccag aggatgaaaa 
gagagagaag aaaaaagcaa gagggaaata 
ttcttagtga aataccaatc agctggttgg 
tttcccctgc gcagacttca cagttgcttt 
aaaatgattt acagtgaggc agggtcagaa 
tgaagttgag gatataggag aaaatcaagt 
gaagccatcc cttccctcaa ttccctgccc 
tccctctgct gttaggttaa tggagataga 
ccacagccaa acttgacttc ttttccccat 
tccatccctg cccatcccct tagttccact 
gactcttaca cactaactgc tttcctgatt 
gttcttgttt gttatgtact tgcatttgtt 
caaattactc tccttaaaaa cttttaataa 
cacagggcct ttgcactggc tatttcttct 
ggcaatatta tgtgagccac atatgtactt 
gaaacaagtg aatttaattt cgataatagt 
tttaaaattt taatatatat ttacctatta 
aagtctggaa tttagtatat attttacatt 
tttcttgtgt ttgataactg tgtatggcta 
gtccttttca tgctttaatc actccattca 
cagtcttcct tgattgcccc atgtagagct 
cttatactgc ttaatatttt tttttctaga 
tttctctctc tccctttgta atggaatcgg 
taccactata tttccagtgg ccagcacaca 
ttgaaataat tgctgcagga caaagtctga 
tagatttcac cactcccacc actcaccagc 
gctcacgttg tcctctcctt aggacatgtt 
actcatcttc caagacccac tttaaaatct 
gacttgatct ctgctttctc tgaaccacag 
cttgagatag tgttagcttt gctcctatct 
aagtctggag ctgcatcata tttttctttg 
tacaatttgg acttaagtaa accttcattt 
caaatattta ttgagcacca accatttgcc 
agtgggcaaa cccatcaaac ttgcaatgga 
atcagataga caacataatg ttagatggtt 
caaagaaaga gtgcctggca ctgtttctat 
ttgcaacatt taagcagaca tacgaatgaa 
atattacttc atttctttga atttccattt 
aagtgaaagt atataacaaa caagaactat 
tgataataat taatgttgac atctattgat 
ctctttaaac accttatctc atttaatcct 
aaactacaat tagctggctt ctgcaaggaa 
attagatggg agcgttagag aattaggctt 
cagtgtaaaa acaaagacag cataaagcac 
aatccatgca ctctaaagat ttccccctat 
gccaaggaaa agaatgcagt tgtcaaaatc 
ttgactgcca gacagtctcc ccaatgatat 
tttatttttt gtattttcaa cttttaagtt 
attacataag taaatgtgtg ccatggtgat 
catcatggat ttgtgtgtca tccttgtgca 
caattttagt atatgtgcta ctgcagcaag 
ttacatatga taaaatgagg ctcactgagg 
tttttaaaga cttggcccta aaccacacag 
actccggagc ctcagccctt caccccgatt 
atcacggagg ttagaatgct gagcacgtag 
agacaattta taagcatgaa ttgagtgaat 



tttctaagtg ggtgtcagac ttctgtatgc 4920 
aaataagaga atgaaaaagg aaaagagaga 4980 
tgtataatgt cagctaatgc aacagtttct 5040 
taatcttatt catgatggat ctcttttgtt 5100 
agaaacccat agtagagccg aacagctaag 5160 
actcaagaga gaaaaagcca gctgcagtcc 5220 
aatatttagc aaagactaat tcattatctt 5280 
atagtcctcc tccttgtcct cttctctgta 5340 
ttttctaatt aggctcactg cgagataaaa 5400 
gtaccttttc ctgtcagtcc ctgaagcctg 5460 
gtaaggcagg ccctcatttc ccctggcatt 5520 
ccagtcttct tcctttaact cattctgcac 5580 
gttattattt ttccttaggc ttcaatctaa 5640 
ctctccattg ccattagaac agctttctac 5700 
acctagaatg ctagatcagt gctatccatt 5760 
ttaaagtttt tagtagcctc attaaaaaaa 5820 
tttatttaac ttagcgtatt taaaataatg 5880 
ttgatatttt tacattcctt gtttggtact 5940 
taccacactt ctcaatttac actattcaca 6000 
gtgactaccg tattggtcag tgcagcccaa 6060 
gatctctgat taaatgtccc ctcctcaggg 6120 
ctccagcctc acttatttgc ctcaaatccc 6180 
gcacaacatt ttatattttt gtttgtttat 6240 
taaggaggca ggatcattgc tggttttatt 6300 
gtagccgcta gatgtgtaag tgataaatga 6360 
ggccctcctg atctggcttg ccctcttact 6420 
taatctgagt ttgttttcca ctctttacgt 6480 
tttcttcccc tttccacata tctaaacctt 6540 
tccttttctg ggaagccttt cctgaatcca 6600 
ggcatatttt ctaagcctat tttatggccc 6660 
aaactcttac tctagactgt gagtccattg 6720 
taatgcccac agcacttggc aggaaatgcc 6780 
aatcagttat tcaatcagtt agtgattcag 6840 
agacaccatt ctgagtgctg gagacaaagc 6900 
atacaggaga tgaacaatac gatgagaaca 6960 
gtgcttcctg tgaaagggaa taaaagaggg 7 020 
tagacaatat tgtctttgag gctccatggc 7080 
gatctgcatg tttgaactct gactttgcgc 7140 
tcctcatctt taaatgctta tttgaagatt 7200 
gcaggcgtat ggtaagggat taatgataga 7260 
cacttatact gtagcgggct tttaaataaa 7320 
tcaaacattc tattggtttc aaacaacaga 7380 
ttttgttgga ggaaatgaga gcattcagaa 7440 
acaaagaatg tgggaaagta ggctagaaag 7500 
ttgaccttat ttactaggtt ccaccatggg 7560 
ttctacatca ctttgctcaa gggtcaatga 7620 
tgggccatga ctaaggaagg tctggacatc 7680
ggagtattta gaatgatact ggatatttta 7740 
cagaggcaca tgtgcagagc atgcaggttt 7800 
ttgctgcata gatcatgaaa atatggaacg 7860 
ggggccatgc tcatcttctc tgtatccttc 7920 
cacgatattg gatattttat tacctacatt 7980 
tttttctttt gttcgtttta ttttgttttg 8040 
aagagctggc atgaaaccca gagctttcag 8100 
ccattgcttc ttgctaaatg ctgccgtttt 8160 
taggtgctct ttactttcta atctagagta 8220 
ggatggatgg atatatggat ggaaggatgg 8280
acagatggat gaaaggttga ctgaattttg
atctctggtc taggagaggg gagttgggag 
gtaatcattg caggtggttc ctaatattac 
aatccccgac aacctcccct tctcaaccaa 
gcatttaggc agctatagct tcttcagttt 
gtaatgaatc cacttttaca tactgcacaa 
tattggactg gaaagcttgg tttgtggagt 
acagatgtct tattaactat ataaccttga 
tttctaatct gtgaagtagg cagttggctg 
ggttttcttt taagacagga gagaaaatta 
tgactacttg ctatatgttg tacactctgc 
cacagcccct gttgttaagg aggcatatct 
ggtatggtcc tgggtacata gtatatagtc 
ggaaactgat ataaagatag cgacatataa 
ggtaactaat tgtatttttc agaagagggg 
ttcatgaact gcataagatt caaacaatgt 
ggaattagga aaagacaagt aacccaagga 
gctgtgggca gggaataatg gctaagattt 
gttgtggcag gtgaaatgtc atcttgatat 
catcatcggg gccttatttg caaatcgaac 
ttgtaaaaca tttcaaaaca ttttacatat 
aaggtcacag tataaaaatc ctttttagag 
acctattata ctctaatttg gatatagata 
aaaggcagtg gtatagtgca tagagtcctg 
ccccaacatt tataaccatg taaccttagg 
ttcatatctg atcaatggaa atgatgaaac 
tattaatatg ctgtatatat ttaaattttt 
cttacagaat ttcattaggt ttttaaaata 
tacatggtta tattgcgtaa tgctgaggtg 
gtgagcatag tacccaatag ttagtttttc 
ctagtaatcc ccagggtcta tttttgtcat 
tcctacttat aaagtgagaa ctcatggtat 
aggataatgg ctactagctg catctatgcc 
aaaattttgt caagtactct attaaggtag 
agaagtagtt tttcaccaaa caatgtctct 
ccaataaact tgccccagaa accttaaatc 
tgttaaccca taggagattc tgaattaggg 
tttgcctttt cgtacagtgt tcacatacag 
tgatagacca ggcagccttc tttcccctat 
gaggaaactg aagcttgaat aatttaaatg 
acagaccaga gatttaagtt cacatctcca 
atactctact tctctaaagt gattatcacc 
ctttccttct cacctaatgt atagagcaat 
atataattga aagaaagaac ttcaaaaata 
gataactaaa acctagagag gaaaagaggt 
agaaacacca aacccaagct cttttgtcca 
tatcctttat ctagttacca ataaataaca 
aggaaactcc tgaagaagct aaataaaatg 
tggattaaga aagaacttat ttgtacattg 
aatttttcat tggcttgctg tttgctggct 
ctgaacattg tgtgtagcat ggctttcatt 
ctggttggta aacctctgcc taattgggaa 
ctccaatttc atctctgttc tccaaccatg 
catttttttt ctgtcaacct tggggctttt 
tgttaacttc tactcatctt tcaattttca 
attttcttgg tccacaacgg ttctctggat 
ttttccacaa aactcgctcc tatcacctgt 



tgcttgcaca aaaagaggcc cctctccacc 8340 
accatgcagt aaagatactt catgtcatgt 8400 
ttatcaatgc atggagctga atttctacaa 8460 
gaacctggac ctgagcttta atcccctgag 8520 
cccagaactg caggtgctgg atttatccag 8580
ggtgaggtgt tcattgtcct atcatttcat 8640 
ctcatcttca ttcacttatt cattcataca 8700 
gcaagctacc tctattctcc aggtctcagt 8760 
agacagcttc taagggcaat tctaatttta 8820 
gcttaaattc tttcataagc agctatttat 8880 
aagaagacag gcatatattg atatataaca 8940 
tcttgaaaga gttaatacct taaagtcctg 9000 
aacacatttt aattatgatt ttttggatct 9060 
cagtaggtga taaattatgt ttaaactaaa 9120 
ccttctctgt ggtgggtagt caagaaagat 9180 
ctagaatatt aaaactagtg tacaggatag 9240 
gaaagatgtc aagattaaag gaaaacatct 9300 
tcttttctga tgcagggaag tatatcgttt 9360
tttaggggaa ccaaattcta aaagggtttt 9420 
tagataatgg atcatgttct ctgcaatggt 9480 
tttttattat agaaattatt gataaagact 9540 
cagacatttc tgtagaagag tgaacatatg 9600 
ggatgtaaca aaggagtaat ggaacaattc 9660 
ttggggtcag aagacctgag ccaagtttac 9720
catattactt catctccctt aatcttagtt 9780 
ttattctgct ggattaaatg tgataataaa 9840 
ataaaatata ttttataagc ataaagtatt 9900 
atttcaactt ttatttttga ttcagggatt 9960
tagggtacaa tcgataccat cactcaggta 10020 
aacccttgct gctttctctc tatcccctct 10080 
ctttatgtcc atgtgtactc catgtttgga 10140 
ttggctttct gttcctttgt taatttgctt 10200 
attatgttct aaatttcagt ttcctgcatg 10260 
accacctctc cctttttttt ttttcaaaca 10320 
tatgtaattc atcttcaatc cactggatac 10380 
tgtgcttaca gagaggccag cttcccttct 10440 
caagcacaaa agatagcaca atagacatcc 10500 
taactcaact agtcttgtaa gaatgctttg 10560 
agaaatatat atatatttct ttttataggt 10620 
acttatatac attatcattg cttgttagcc 10680 
gaatccaact taaatgtttt ctttgtctta 10740 
aatgtaatga tatagagaca cagcaagacc 10800 
gcagagatag aatgatgggc tataacaatc 10860 
atcaagttca gctgtttgac ttataaatgt 10920 
actcaagatc acacagtagg agaggactgc 10980 
ctcttccagc gttctttcta ctatactgcc 11040 
aaagcttgga ccacaatgct tttattgtct 11100 
ggtggggaat attgtaaatg taattcaggc 11160 
taactgacaa gcacctgcaa tgctgaaagg 11220 
gcatcaaagc cctgtctcta ggacatgtct 11280 
tcttttagga taaaattcaa aaccctttat 11340 
ccttctttct ccacaactcc atattgtaca 11400 
gaagctattt gtcatgattc ctccttgtgt 11460 
gtgtttgctg ttcacttcac ctccttttat 11520
acttaagtgt tctcagagaa acctactttg 11580
gtgaactctt atagcacata attttcactt 11640 
tacaagcatt tacctctgat aacaagaact 11700
ttcaaatatc
tatgtgtaat 
actaacaact 
gaaaatggct 
tatgtcaatt 
tgtatgccta 
tgtggggctt 
gcacaatcat 
attttaggtt 
gtgaaatcca 
tattgacagg 
tacagaagct 
atctcaaaac 
ctgagtattt 
aaagtattta 
tagacctgtc 
ttcataagct 
aaggtctggc 
acttggaaaa 
tccgattagc 
caaatgtttc 
ataatttcgg 
tgaaactcaa 
cagaagttga 
aaggttgctg 
tcaatggtgt 
atttccagca 
acctcattta 
atggcttgtc 
ttccagatat 
tggagcagtt 
gccacaacaa 
ttcttgatta 
caagtagtct 
agagtttcct 
aatgtgcaac 
agatgaataa 
cagttctggt 
gtagaggtga 
taaggaatga 
actacagaga 
ataaaagccg 
tctttgaata 
tcattgtcct 
ttctcagcag 
ggagacgact 
gtacaggatg 
gaggcatttc 
cacatgtcag 
gctaatctca 
tccaggtggg 
ctcttacctc 
tgagtctttt 
ttgactgaac 
atgactgcag 
ccttacagag 
caaccatcct 



tagctgtcat 
gtttcgtatc 
atccatatta 
aaacttgatc 
attgatcttt 
acatgacaaa 
cttatttgct 
atgacccatc 
cttattcagc 
gacaattgaa 
aaaccccatc 
ggtggctgtg 
tttgaaagaa 
ttctaatctg 
ttgcacagac 
cctgaaccct 
gactttaaga 
tggtttagaa 
gtttgacaaa 
atacttagac 
ttcattttcc 
atggcaacat 
atctctcaaa 
tctaccaagc 
ttctcaaagt 
tattaccatg 
ttccaatttg 
ccttgacatt 
cagtctcgaa 
cttcacagag 
gtctccaaca 
cttcttttca 
cagtctcaat 
agctttctta 
gcaatggatc 
accttcagat 
gaccatcatt 
ctataagttc 
aaacatctat 
gctagtaaag 
ctttattccc 
aaaggtgatt 
tgagattgct 
gcagaaggtg 
gaacacttac 
cagaaaagcc 
caattggcag 
ttgcccagct 
gccttatgct 
aggagcttcc
catttcaacc 
atcaagttga 
gaatggaaat 
tgggtgttca 
tcgtcaaggg 
gttaaagtct 
ggtcattctc 



gtaagcactt 
tctgaaattg 
tctgtaccaa 
caaggctatt 
aactgatttc 
aaagagccta 
tattccatca 
acatctgtat 
agaaatatta 
gatggggcat 
cagagtttag 
gagacaaatc 
cttaatgtgg 
accaatctag 
ttgcgggttc 
atgaacttta 
aataattttg 
gtccatcgtt 
tctgctctag 
tactacctcg 
ctggtgagtg 
ttagaattag 
aggcttactt 
cttgagtttc 
gattttggga 
agttcaaact 
aaacaaatga 
tctcatactc 
gtcttgaaaa 
ctgagaaact 
gcatttaact 
ttggatacgt 
cacataatga 
aatcttactc 
aaggaccaga 
aagcagggca 
ggtgtgtcgg 
tattttcacc 
gatgcctttg 
aatttagaag 
ggtgtggcca 
gttgtggtgt 
cagacctggc 
gagaagaccc 
ctggagtggg 
ctgctggatg 
gaagcaacat 
gggtccaaca 
aagggtgagt 
agtgcagagg 
aactcagtca 
ataaagacag 
tgtattatgt 
ctttttcctt 
gctcctgatg 
agtggctaat 
gagcatgttc 



ttcataaaca 
atatttacca 
tcagatgtat 
acatgcttta 
cttcttatgg 
tcattgcagc 
tcatctgtcc 
gaagagctgg 
gataatcaat 
atcagagcct 
ccctgggagc 
tagcatctct 
ctcacaatct 
agcacttgga 
tacatcaaat 
tccaaccagg 
atagtttaaa 
tggttctggg 
agggcctgtg 
atgatattat 
tgactattga 
ttaactgtaa 
tcacttccaa 
tagatctcag 
caaccagcct 
tcttgggctt 
gtgagttttc 
acaccagagt 
tggctggcaa 
tgaccttcct 
cactctccag 
ttccttataa 
cttccaaaaa 
agaatgactt 
ggcagctctt 
tgcctgtgct 
tcctcagtgt 
tgatgcttct 
ttatctactc 
aaggggtgcc 
ttgctgccaa 
cccagcactt 
agtttctgag 
tgctcaggca 
aggacagtgt 
gtaaatcatg 
ctatctgaag 
cttgttcagt 
aattccatgg 
gaataaatgc 
aggaacccat 
agaaaacaga 
tatagccatc 
tttgattgaa 
caagatgccc 
tcctaaggaa 
tattttttaa 



ttaagagtat 
gtcatttatc 
aatcacaatt 
tcaactgcac 
attttctcct 
cagtatgata 
tgcttgatgt 
atgactagga 
gtctttttat 
aagccacctc 
cttttctgga 
agagaacttc 
tatccaatct 
cctttccagc 
gcccctactc 
tgcatttaaa 
tgtaatgaaa 
agaatttaga 
caatttgacc 
tgacttattt 
aagggtaaaa 
atttggacag 
caaaggtggg 
tagaaatggc 
aaagtattta 
agaacaacta 
agtattccta 
tgctttcaat 
ttctttccag 
ggacctctct 
tcttcaggta 
gtgtctgaac 
acaggaacta 
tgcttgtact 
ggtggaagtt 
gagtttgaat 
gcttgtagta 
tgctggctgc 
aagccaggat 
tccatttcag 
catcatccat 
catccagagc 
cagtcgtgct 
gcaggtggag 
cctggggcgg 
gaatccagaa 
aggaaaaata 
taataagtat 
tgcactagat 
tagactaaaa 
gacaaagaaa 
aagagacatt 
ataaaaccat 
tacaatttaa 
cttccatttt 
acctgattaa 
ctaatcaccc 



ctgtgacact 
ttggctacca 
ttgtgtgaca 
aatctttata 
ctgcttatca 
atactcagtc 
ctttgcctat 
ttaatattct 
tcctgtaggt 
tctaccttaa 
ctatcaagtt 
cccattggac 
ttcaaattac 
aacaagattc 
aatctctctt 
gaaattaggc 
acttgtattc 
aatgaaggaa 
attgaagaat 
aattgtttga 
gacttttctt 
tttcccacat 
aatgcttttt 
ttgagtttca 
gatctgagct 
gaacatctgg 
tcactcagaa 
ggcatcttca 
gaaaacttcc 
cagtgtcaac 
ctaaatatga 
tccctccagg 
cagcattttc 
tgtgaacacc 
gaacgaatgg 
atcacctgtc 
tctgttgtag 
ataaagtatg 
gaggactggg 
ctctgccttc 
gaaggtttcc 
cgctggtgta 
ggtatcatct 
ctgtaccgcc 
cacatcttct 
ggaacagtgg 
aaaacctcct 
taaatgctgc 
atgcagggct 
tacagagtct 
gtcatttcaa 
gttcttttcc 
tttggtagtt 
attctacttg 
aagtctgtct 
cacatgctca 
ctgatatatt 



11760 
11820 
11880 
11940 
12000 
12060 
12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
tttattttta tatatccagt tttcattttt
ataaggttgt ttaagacgtg cttcaaatat 
tggaaaagta cactctgtca ctttgtcact 
gtaatgactg tcatgaaagc agcattgaaa 
cgggaagaaa atttccgctt cctggtctta 
gaagtgggat gacctcagga ggtcaccttt 
acccggggtg acctcatgaa atgagttgca 
tgtttgatgg acctctgaat ctctttaggg 
acccttctca ctgccaggag aactacgtgt 
ctgtttcctg ttgggcaatg ctccttgacc 
agaaaacaat gtgtctggaa ttaatggggt 
ttcatccagc ctcctcagaa acagaacatt 
gaaatgaaaa taaaaaccac aatgagatat 
aaaaaatgaa gtgtcatcaa ggatatagag 
aatggaaaat ggtgtagccg ttatgaaaaa 
agaactgcta tatgatccag caatctcact 
agaatttcaa gaaaatattt acactcccat 
ttccaaagtt atggaaacaa cccaaatttc 
gcatatacgt acaatgggat attattcagc 
acaacatgaa taaacccgga ggccattatg 
aaatactgcc tgatttcatt tatatgaggt 
aatagaacag tggttcctag ggaaaaggag 
ctaattggta taaaattata gtatgcaaga 
agagttcgta taatgaacaa tactgtatta 
ctcatgttaa gtgttcttac catatacata 
atatttatta ccttgattgt ggtgatggtt 
caaattgtat acattaaata tatgcagttt 
taaaaaagaa aagacaacaa aattcagttg 
gtgtttgtta ctgagtgttt cagagtgtgt 
atccctgggt gtgtttccat gtctcatgta 
catatcccta tgtatcccta tcagggctgt 
tcatatctgt atagaagaga gtgtgattat 
gatgtctatg gctgtttgag atgagttctc 
tatcccttat gcttggtgga tacgttctta 
ggtaccaaac ctcatatatg caatattttt 
atcttctgaa ttaggcacag taagagatta 
taataatata ttgtaataaa agttatgtga 
tatcttactg tactgtactc acctattttc 
gtgaaactgt ggataagtga ggaactaaca 
tgcctcagtt tcttagtatg cttgaaaatg 
aagtgtggtt ctattcatat ttgaatatga 
tgtacaaggc tctgagaagg aagatcaaca 
ttcaaggctg acattggcct ccttcctatc 
atggcaggtg aggtggacaa gcagggaaag 
aagagtaaag tcaggcctca gccagcctct 
agggataaag ggtggaataa ggataaatta 
ataccttaca acttgaaaca tattcacaaa 
taaggtagca aatcataatt aactttttcc 
gaagtaaatt tttcaatagc ccacagagta 
caggtcactc aaggacactt tctttcttcc 
aaaacaggaa gtgaaactcc tgcagttttc 
gttgaagcca tgaatcatta agccaataca 
ttatttataa ccctatcttt ccataaagga 
aacaataggg ttaatcaata ataagtagaa 
aagacaccga gaatgagcgt taatacagtg 
tggcttttgg agaaagccaa aactcaaatt 
catttctctg agagtctact tagctccaat 



ttacgtcttg cctataagct aatatcataa 15180 
ccatattaac cactattttt caaggaagta 15240 
cgatgtcatt ccaaagttat tgcctactaa 15300
taatttgttt aaagggggca ctcttttaaa 15360 
tcatggacaa tttgggctag aggcaggaag 15420 
tcttgattcc agaaacatat gggctgataa 15480 
gcagaagttt atttttttca gaacaagtga 15540 
agacacagat ggctgggatc cctcccctgt 15600 
gaaggtattc aaggcaggga gtatacattg 15660 
acattttggg aagagtggat gttatcattg 15720 
tcttataaag aaggttccca gaaaagaatg 15780 
caagaaaagg acaatcagga tgtcatcagg 15840 
caccttatac caggtagaat ggctactata 15900 
aaattggaac ccttcttcac tgctggaggg 15960 
cagtacggag gtttctcaaa aattaaaaat 16020
tctgtatata tacccaaaat aattgaaatc 16080 
gttcattgtg gcactcttca caatcactgt 16140 
cattgaaaaa taaatggaca aagaaaatgt 16200
ctaaaaaaag ggggaatcct gttatttatg 16260 
ctatgtaaaa tgagcaagta acagaaagac 16320 
tctaaaatag tcaaactcat agaagcagag 16380
gaagggagaa atgaggaaat agggagttgt 16440 
tgaattagct ctaaagatca gctgtatagc 16500 
tgcacttaac attttgttaa gagggtacct 16560 
tacacaagga agcttttgga ggtgatggat 16620 
tgacaggtat gtgactatgt ctaaactcat 16680 
tataatatca attatgtctg aatgaagcta 16740 
tcaaaactgg aaatatgacc acagtcagaa 16800 
ttggtttgag caggtctagg gtgattgaac 16860 
ctagtgaaag tagatgtgtg catttgtgca 16920 
gtgtatttga aagtgtgtgt gtccgcatga 16980
atttcttgaa gaatacatcc atttgaaatg 17040 
tactcttgtg cttgtacagt agtctcccct 17100 
gaccccaagt ggatctctga gaccgcagat 17160 
tcctatacat aaatacctaa gataaagttc 17220
acaataacta acaataaaat tgaatagtta 17280 
atgtgatctc tttctttctc tctctcaaaa 17340 
agaccataac tgaccatgaa acctgggaaa 17400 
tacatacatg attgtttatc tacagatgta 17460 
tatgattttg tgtatatccg tgctacatgt 17520 
attctgcata agtgtgttta ttcaagcaaa 17580 
tacaacttgg aatatttcaa ggccgaaata 17640 
agttccctct cccagatgga aattctagaa 17700 
aaattatatg catagaacag aaggagaaga 17760 
ttttagctct ttaaatcctc tggatttaag 17820 
atgccaattg taatgcctta aatttgtgtg 17880 
actatatatt tgaatatctc attagctgag 17940 
attttattga tgggaaagct gaagttcaat 18000 
ggaaagtgac aaaacctgag cctgggcctc 18060 
acacccaatt gcttcatgct taaagttggc 18120
tgtgtggttg acactagcaa gggtttctca 18180 
tatgcatata tgttatacat accaaatgat 18240 
cttgaaggag cttcaaacaa aggatatgtg 18300
aatctggaca tagaataaaa agaggagaga 18360 
ctttccattt ttctggtgtt ttgagtagcg 18420
cactccttat caactgtgtg ccttgggctc 18480 
gtaaaataag aatagaacta tgactttgta 18540 
aggttgctct
ggtgagtact 
tcagtaacaa 
taactaaaat 
gtcttcttct 
aatgaactca 
ccagtgtctc 
gtcttaattc 



aaggattgaa 
cccctaagac 
cactaaagta 
tagacatttc 
ggactctata 
cacaatatgt 
catttacttg 
atgaabydca 



aatcatgtat 
tttattaaga 
attgctattt 
tttctgattc 
aactttttag 
gacagtgaac 
tgtgattgtg 
bydcabydc 



<210> 48 

<211> 50000 

<212> DNA 

<213> Mus musculus 



tatgttcaat 
gggcactagg 
ttccagcctg 
attctctact 
aaatcatcag 
tccctgggag 
ggcaagtcat 



agaagcactg 
tggaaccaca 
cacgggattg 
caggctcctg 
agtgaaaacc 
tcaagtgctt 



tgtccttatg 
ggaggtcttc 
gaagtgactg 
tcagacccca 
gagaagctta 
aaagtctaag 
tgaggctcag 



18600 
18660 
18720 
18780 
18840 
18900 
18960 
18989 



<400> 48 

tttcacatcc 

gatatatgga 

agtgagtctg 

taagtctcaa 

tttatggcct 

agacatgaat 

ctatctatgt 

ccttcccgtc 

acttccctct 

acaatgtcca 

gtgtgttccc 

ttgtattaac 

atttgtagat 

aatcttgggt 

taatacatat 

tgaatatagt 

cccgtcatga 

gctccatcct 

atgaaaaata 

atttaggatc 

tgtataagat 

taaagtaaaa 

gtatgcaagg 

tttatataca 

aacagctata 

tataagagtt 

gtccctattg 

gtagaatttg 

tgcatagttt 

ggttttattc 

ctagcccaag 

tctttctgct 

aattttctaa 

gaatcaccac 

gaatctttaa 

ccatcttcca 

tgttgatatg 

gggccaccta 

tgtcaggaac 

tctttttgta 

ctgacagaca 

ctcaattagc 



atgataggtc 
gatagaaaat 
aattctgcat 
ggtagcaggg 
tcaacctact 
ggaatgtaaa 
ttacatgtaa 
agaatatgta 
ttgtgtatat 
tatttgcttc 
tcgccctgtt 
atagtttaag 
tgtagtatgt 
aaatgatttt 
gaggtattaa 
cttatgactg 
ttgtttagga 
aaaaccccct 
gtgagccact 
tggaaaataa 
atagatatga 
tttttgtgac 
gaagaaattt 
catttataat 
aacctcatta 
ccccaaaaca 
ctgaaaacac 
gtttgtatga 
ttcagatatg 
atttgtatgg 
ctgatcaaaa 
ttatccttcc 
aggacattgt 
atgattaatt 
tttttttctt 
ttccctctgc 
tattaaacac 
ttgaaatatg 
cgcctaggaa 
tactctagaa 
tttttcctat 
aaatttcctg 



aagaatgtaa 
aaacaataca 
gccaatgtag 
gagaagaatc 
gggtctggcc 
gtatctttat 
tattaaccat 
ttggaaggta 
attcacacaa 
tctcctaggt 
cattataagt 
tatcttcctt 
gactcactgg 
acaatttatg 
aaagtactcg 
acataagctt 
agtatatcaa 
ggcgactctg 
agaaggtaag 
catttacaca 
ataaagcaca 
ttacatctta 
tatttacata 
tgtaatgtat 
ttaataatga 
atatagccta 
tgacactatg 
ccacaaattg 
tattcatgct 
tttgttgaaa 
attcactgtg 
aagtgctggg 
gatgaatatc 
ttctagagaa 
tgagagattc 
aaatttcccc 
acttagtcta 
aacaaactgt 
aggttaaggc 
tgtgtgagct 
cactgttgca 
cagatcaaca 



tctaagttat 
cagtgggaag 
gagactccag 
ttttcttttg 
cactcacatt 
aagagtgaaa 
aacatgagca 
aaactgaatc 
aacttctttt 
tttggacaat 
gaattaaact 
tatgctaata 
cctaaaccct 
cctcaatttt 
acaaataaaa 
ctaccagttg 
atatatgaaa 
attctgcata 
gggagtaagg 
cttgtccccc 
gattttcatc 
gatttcctct 
aattccatgt 
ttgccacatt 
gaaacattct 
gtgctgttgt 
aactttgaac 
tcttttaacc 
ttgtgctttt 
tttcagtatt 
tagcttcaac 
attacaggca 
cttgtacact 
aaactgcttt 
atacgtgtgc 
caaactgtcc 
tttagtgcta 
tacaaaaggg 
ttgtaggtgg 
ctgagaggca 
tagcctaaca 
taaagataaa 



aaggtttcac 
acctggcaca 
gcaaagctcg 
gaggaattaa 
agagtgcttt 
gactatctgt 
ctgacatttc 
tttttttctt 
agattattct 
tattttccta 
tgctgatact 
aagattgcag 
gctcctgtct 
ttcttataat 
ggttcttggg 
aagtgaagaa 
ttaagcgaat 
tttgcaatgt 
aaagatgtta 
acccctacaa 
tctgaccact 
gatggctttg 
aaaacatata 
gggataacaa 
tttgagtttt 
tttgcagaga 
aaaagaccat 
agcaatgtca 
atttaatttc 
ttgagataag 
tgaccttaaa 
cagcccagct 
tatctttgga 
gtttctgttg 
ccaatacact 
caacttcatg 
tcagtatgtg 
cctcattctt 
ccttcctgga 
agatcccaag 
attcatgggc 
ctcttgtgaa 



ctagtaacca 
ttgtgaggta 
tggtgcagag 
ccctttttag 
gcttagtctt 
gtgtcatgac 
tggattgtga 
tattgctttt 
gttttcttct 
taaaatatta 
ttttaaaagt 
attgaacaaa 
cttacaatgc 
ttgaatgcat 
aaacacttgg 
tggggttcaa 
cttcctctca 
agttttctgt 
aggggttgat 
cattgaaccc 
atcctcttca 
atgaagctag 
aattcatgtg 
tactctcatc 
atcatggaag 
ttggaggtat 
gagggtttcg 
tactggagaa 
cttcttattg 
agctcactct 
cttaagacaa 
tgtggagttt 
gcctgcccat 
ttcatcttta 
ttaatcctag 
acctctctgt 
cattggtgtg 
gataaaagct 
tgtggcctac 
cttcatgcag 
atcagctcac 
ttagtgctgt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 
ttagatgaat
gttttaagag 
gctcattcct 
attactaaaa 
gaaagccaca 
atgagcatat 
attttgtgta 
ccgaagaaaa 
atccctccct 
ccatccacac 
atagtggtgt 
acctctttgg 
ctagaaagaa 
tgttaattgc 
gcgcttaatg 
acctgcatct 
ccccagtaaa 
ttccccttcc 
atgagggtgc 
actgtggcat 
ctcttcctat 
tggggtcccc 
aatcaattgc 
tgctgggttt 
tagtggtttg 
aataggtgtg 
agtgctcaag 
acagaaacct 
ccatgtctgc 
agctgtgagc 
ctcttcacag 
cctgtcatgc 
tccttataag 
acttgggaga 
ctccttgtaa 
aattaacata 
aattaaaatt 
agtaaaagta 
aagtacaaaa 
atttgaaact 
ttaaattcat 
aagacagcat 
aaattgaccc 
acataaaacc 
tgggcaaagg 
atggggcctc 
aaggcaaaca 
caatatatac 
tggggtacaa 
taaagaaata 
agtgtattcc 
acagaaacat 
aatattttaa 
caaaggctaa 
aaaaagtgaa 
tggaattaca 
tataatatac 



taatgatttt 
aaagtttttg 
cataatcata 
taggcttggg 
ggtatgcact 
agaatgaaag 
acattttatc 
gaaataaagg 
ccatccatcc 
atgtccatcc 
agtcattttc 
tacacaagga 
gagcactagc 
cagagaatta 
atgcacaaaa 
gatgttgcgt 
aactgattct 
cggtttttcc 
tccccaaccc 
tgaaccttca 
ggggttgcaa 
gtgctcattt 
caatgagtct 
gtggctatct 
aatatgtttg 
gctttgttgg 
ctctacccag 
tctgactgcc 
ctggatgctg 
aagcctcaat 
caataaaaac 
tctgaatgca 
gctcaggaac 
actactgcaa 
tttaatcccc 
tagcttgatt 
gatgtttaca 
atataagctt 
tttgatagat 
ggagaaagag 
gtcatccact 
ttttaacaaa 
attcttatct 
agataccctg 
ggaaaaattc 
ataaaattgc 
gattgggaaa 
aaacaactca 
gctaaacaaa 
ttgaacattg 
tgaagtgtta 
ctggggaaat 
attaatttta 
tggataataa 
ataatacaaa 
aaaatttcaa 
ctcaaataaa 



atagaattcc 
ttagaaaaat 
agataaagct 
tcaaatttgt 
atgataagac 
actgctttga 
tatgaagtaa 
tgtgatggtt 
atccacacat 
atccatccat 
tgcttcacct 
gttaactagc 
aataaatcct 
tatactaagt 
cagttagaga 
cctacctcag 
ttttaattgg 
tctgcatact 
acttacccac 
tgggaccaag 
accccttcag 
cgatggttgg 
tcagttagga 
tgatttcgtc 
gctcacggga 
aggaagtatg 
tgcagaagag 
ttcagatcaa 
ccatgctttt 
taaatgtttg 
ctacaacaca 
ctgtttgctc 
aattacagaa 
aacagtgagt 
atacctcaaa 
taatttagac 
catttataag 
tttttacatt 
cttaacttgt 
atgccttgtt 
tgatctttga 
tggtgctggc 
ccttaggcaa 
aaatttataa 
ctgagcagaa 
aaagcttctg 
agatctttac 
agaagttaga 
gaattttcag 
ttagtcatca 
taaaaatggt 
aacaacatat 
aaatcaccat 
aaatgtgtta 
ttttaggaat 
agactggacc 
gatgacaact 



tcatttgatt 
gttataaagt 
gcataataag 
attcaaggaa 
aaggtcaagc 
agttagtatc 
tttgctaata 
tggcttggag 
gtccatctat 
ccatccatcc 
agtatatatg 
caggcctgag 
ctactgaatt 
ttataaagta 
actaaaaggc 
cttgttcccc 
ttattatatt 
ccccatcccc 
tcttgcctca 
ggcctcctgt 
ctccttcagt 
cttcaagcat 
gtcgggcttc 
cagatgaact 
tgacactatc 
ttaaagtatt 
agcttctttt 
aatgcagaac 
tgacattatg 
tatttatgag 
tagcttctgt 
agctttgcat 
gagtgggtaa 
tccagacaca 
ccaagcacat 
ataatcagtt 
aaaattaaca 
ttcctaaagt 
ttcttttttc 
tactcaggtt 
caaaggagct 
tcaactgtct 
agctcaagtc 
aggagagagt 
caccagtggc 
taatgcaaag 
caatcctaca 
ctccagagaa 
ctgaggaata 
gggaaatgca 
ccttaaacct 
ttactattta 
gtctatctta 
tatgtatacc 
gtgcatggat 
aatagtcctt 
ttgagggttt 



catagaattt 
tagaatcaag 
gaatacagtg 
aaaacattca 
aaaactgttg 
agcctcctcc 
actgtttatg 
gggctctgca 
ccatccctcc 
atccatccat 
tattcctgtg 
aagggcccct 
gctccctgct 
aataagaatt 
cagagatcat 
taagccagca 
tgtttacatt 
tccagctgcc 
ctgccctagc 
ccaattgatg 
cctttctcta 
tctcctctgc 
ataggtttca 
ctagatgaac 
aggaggtata 
ggagggcttt 
ttcttgtctg 
tctagggtcc 
ataatggatt 
aattgccttg 
aaatttatgt 
agcttatcta 
agatgttgta 
actctctctt 
ctttcacact 
gctactggag 
aattatttgt 
cagttcctta 
aaagcaattt 
aaaatgctga 
aaaaccatac 
gtcagcatgt 
caagtggatc 
ggagaagagg 
ttaagatcaa 
gacactgtca 
tccaatagag 
ccaaataacc 
ttgaatggcc 
aatcaaaaca 
aatgacctga 
aaatactgaa 
aaatgtcatt 
atgagatttt 
ttaaaaaatt 
attcagaagg 
gatatgtgtt 



taagaagaaa 
aatagaatat 
agctttcaca 
ggtccaagga 
ctttgaattt 
tgtaaattcc 
tataaaaagg 
agactcaccc 
ctccatccat 
ccatccagtt 
agtgactttt 
ggcctgctgg 
atacagcata 
aagctttaca 
caatcttttg 
gccccctgac 
tcacatgtta 
ccctgcttct 
attcacctat 
ccccataagg 
actcctccac 
atttttcagg 
actccatcca 
tccttggatg 
accttattgg 
gaggtttctt 
actgcccaag 
ttctccagca 
gaacctctga 
gtcatggtgt 
gtgcaacata 
caataacatt 
agagccattg 
caatgtggtg 
ctgttcccca 
gacttcctgc 
agtgcaatta 
gatttttctt 
agcaaatatt 
caatgaggtc 
agttgaaaaa 
acaaaaatgc 
aagaacctct 
cttgaacaca 
gaatctacaa 
ataggacaaa 
ggctaatatt 
ctattaaaaa 
aagaatcacc 
accctgagaa 
ggagagtaat 
gaaaatgtgg 
aaactatcac 
agacagaaaa 
atactcagac 
acaaatacta 
taatatggct 



2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
gcagagggct gtttaagttt atggaacttg
taaagatgga ggaagaacta agacaatatc 
attggtgggg aaaaggtaca gccatggcat 
aacaaactat atgtaaaagt gcatagtgga 
ataaagcaat agtgaatact ctacaaaact 
tcagtcaggt tttgctataa aaatattgaa 
attcagagag cttaatgcta taattctttt 
gtatacatct gaaaaaaatg ttcttgatta 
gaatttactg tgtagctcac aaatggaaag 
agagaacttt cataagcact gctactcaac 
aagtagacct ggaaagggcc ctcatcagaa 
ataatttttg agttgttcaa agtcagacag 
ttgtatagga agaagtggaa aatgatctag 
taatcacttt ttattccctt ttgttttttt 
tttcccagct gtacttaaat tgtttagaat 
tcagtctatt gataaggctt ttctttctag 
atagggtaaa tctcttaatc tcttctagcc 
gttgttgtta attattaaaa tttaagtatc 
gatttcttct aaaaactttg ttatgaggtg 
ttttacatct agaaacaatg gcaagaggtt 
cttcatttca agttgctaac taaaccgcaa 
aatgtgtcag tgattaatgg caatatgatt 
agtcatttaa attcataaac tcactttact 
taggaaccaa tgctcaataa tgcacaccaa 
tattcacata tatttcattg gctaagtgaa 
agagaagttt aagtgtgctt tagtgaataa 
ggacttaata atgcaatcaa agagatcctc 
gaaatatcat atgctgtacc atacaggaac 
tggtagtaaa agttcattga gcccttgtct 
caagctttca caaaaccaag atatactagt 
^gggtaagaa atagctgaac acattcagtt 
ctaggataga atgcatgaag ccctgtcata 
gagagagaga gagagagaga gagagagaga 
gaaggaagga aggaaggaag gaaggaagga 
gaaggaagga gggaaaagtt aataagtaca 
acttgggtat ctccatgaag gataaatctg 
agaggacatt tagagtagtt ttgtaattta 
gaatacatga aaagctatga aacagtgact 
aaattgtgga tgtcaaagtt atgagacatg 
tagtatttat ttttattttt aaaattaatt 
ctcctctcct cccagtctct ttctcttacc 
tcagagaaag ggaagactcc catggacatt 
gactaagtac atctcctatt cagccttgag 
aaggcaggca acagagttat agacagctgc 
agaccaagct gcacatctat tacatatgtg 
ctctggttgg cagttcaatc tctatgagtc 
gttttcttgt agtgtctttg atgcctctag 
aatattcctc aagtccgcct gatgtttggt 
agactctcag aggacagtta ttctaggttc 
gcatgagtct caaatagttg gtttagtcat 
cctttaccct gtacatctta tagacaggat 
tttttgtcct catccctcca atggaagtct 
ctctgactac taggaatctt agctggagtc 
tcctgggttt ctagtttgtc taagagattc 
atctgtcagt ctcatatttt ctaccattta 
tccaaaatac tgaagaatcc tcacatttcc 
agtctcataa ccaagaaata aaaattaatc 



aaagtggtac atgagagaag gaaaaacttt 6000 
tgagacatga aagtggaaaa tgtgtgtatt 6060 
ggggtgggaa gagattcaga gaaaagcatc 6120 
gccaaccatt tttaagccaa taaacaccaa 6180 
aagtttctat ttagttttac tttcttcttc 6240 
atatgccaag tcctgtcaaa gattaagttt 6300 
caaaatttat aatcacacat atggccatat 6360 
taattaccac tttcccaggc ctccgtttta 6420 
agtaggtcac ctcatgtgaa aataaattac 6480 
caaggggctg gagacacgcc atccagctaa 6540 
aacaacagag gaaatgtcat agagatagaa 6600 
atatattgac atgaagaact ggtcatgtgt 6660 
cattcccaga agctcatagg gactataacc 6720 
ttttttttta atcaatcaat tttttgttga 6780 
cagctcacaa gtaagctgtc cttccaaaag 6840 
cttgtctttg acaaaatagc tcatgacatt 6900 
ttaaaggttt ttgttgttgt tgatgatgat 6960 
actcttgttt tttttttcct gtgccataga 7020 
attagtaaag cacatgtaag ctagatgttg 7080 
tctcttctca ttggtacaaa gtagcatttc 7140 
tccaggctag tctcagtcta ctgacattga 7200 
atgttggtag ctaggttttc aaaccatcct 7260 
tatttggctt atgttacaga ataatgaatg 7320
tgtgaaactt caggttgtta tgtctaatta 7380 
tcatgaggta aaaccctaaa tgatcaaagt 7440 
tgacaaatat tgacaggaag aaaaaggtca 7500 
tgacattgaa ataacttatt cctacttagt 7560 
gcatttgaac cagttttaag gaacaagcat 7620 
agcatacaag aatttctggc tttggtttcc 7680 
gcacacttaa aatgtaggaa atatgtcaaa 7740 
tctgacctcc aactcaaagt cggttagagg 7800 
atgaaagaga gagagagaga gagagagaga 7860 
gagagagaga gaaggaagga aggaaagaag 7920 
aggaaggaag gaaggaagga aggaaggaag 7980 
tcatatatca aaactggttg gtacctgtat 8040 
gactagaacc attaactgag gatattgccc 8100 
ctctgcatgt tacattttat tttatattat 8160 
aaacttagtt cattctatta atatagacgg 8220 
ctttattttg tacttgtttt ggcgactatt 8280
tgtttacatc acaagcacaa cttctcctcc 8340 
tcctttctct acatccccct cactttctcc 8400 
atcttgcctt ggcatatcaa cttgcagaag 8460 
aaggcatccc agtcagggga gaggagccca 8520 
tgctttattt gttgtaaagg acccacatga 8580 
cagagggttt agatccatcc catgcatgct 8640 
attttgtgcc taggctagtt gaccctgtag 8700 
ctcctttaat ttttcctccc tatcttccac 8760 
tgtggatctc tctatatgtt tactgggtaa 8820
ctgcttatca agaatagggt ctctcacatg 8880 
ttataggcca tttccttaat ttctgctcca 8940 
aatttgtggg tcaaaggttt tgtggttggg 9000 
caaaggagat ggccatttca ggttccataa 9060 
acctttatag gttcttggga attttacttt 9120
cccaattcta ccaattccag ttttatattc 9180 
tttcttttga tttaacactg tatcaggttt 9240 
ttgactaccc aagagtattc gtagacttaa 9300 
acttcttatt gtgctggatg tttttttgca 9360
atgtagaatt ttataatgaa ttaaaactaa
aagggtgcag taaatggtgg cttttctatg 
attggataag aatgtcttgc tagttaaggg 
gagtgtccac caaagatatg accaagaatg 
actacattgc agttttatct tgtcctattt 
tgagaatctt atcaacagcc aaattaatta 
ttcttcccat gagagaggtt ccacaaaaaa 
tatatgtcag tgtcctcaaa catcagagga 
ttattgacta gagtcaccat agataaccat 
aataaaataa aataaaataa aataaaacaa 
acaaggggca agtaggatgg gtcagaaagt 
tgaattttga ccaatgaaac ctacaagatg 
tctcattttc atatgaaaac tatcacacac 
agagagagag agagagagag agagagagag 
aaaatccaaa agaaaagaat gttgaatatt 
aatgtgaaca ctagataaaa tacaaacacg 
ttatatggag attgtatagg tcaatgatta 
atgaataaat cgttaataaa caatttagga 
ctgtatttga gacttaagct tggaaccata 
tcaagctaga aaaaatagga acatgctaga 
tcttaaagac gtctctaaaa aaaaaaaagc 
gtgtacattt ggtcttagtt ttcttaggtt 
attgtgttga gtgagagtaa ctgtagacaa 
gtcttttgaa ggagcccggg gagtggcaaa 
taatccaggg cacattttag aatatcttac 
atctttgcag agatattttc ctaaccagca 
cattcctcat aaaaagaagc ttatatttat 
tttacttgta ttgatttgaa aactttaaca 
acaacattta aaaagttgta aatgtatatt 
tgaatttaag aataattact ttcctggaag 
gaaactttca gtaaaatggg ctttggaagc 
atatgttcag tttaggtgag taccatagtc 
accttgagct tgaattatag agaattacat 
gtttttctat ttactagtat tatcatgtct 
cagtttttgt tgtctaacaa tcctcatagc 
agttaggaaa gatgtgaccc gatcatattg 
tcctcaaata atgtattcag cactggatat 
attgattttc ttattttttt atgtgtatga 
aacacatgtg tacctgagga aaccagagag 
gatggttgtg agcattcatg tgagctctgg 
ggagtgctcc taacactgag ctatctcccc 
aagtatctgt tttctggatg ccaaacagac 
ctccttaggt cctccctcct ctttccttca 
ctgtgcatta tctctatagt gtctgcctct 
attaatcata agtcttgagc ctgcatatta 
cttggtttaa ttacttgttt tcccccagga 
ggagtgtatg tgtgtgtgtc tgtctgtgtg 
aggttttaag atatttccct taaaccctga 
ctaaagaatt acaatgtacc atcatttctg 
tataaagcca ggtataatct gtgatccttt 
ttctttgaag gctttttaca aactggttga 
tcccttcagg acgttatatt gtttggtttt 
gatacttggg gaaattagaa acaaatgaga 
tattctcata agtgtaacca cacaacagag 
aatcacaaag caaatgcctc tccatcaatg 
tatctagtac aaaatagaaa ataaaataaa 
ctacagtgta tgtgtctatc atcatacttt 



gttacaaatg ggctttacaa atttagtgat 9420 
atacagccag tcttaactgc caacatatac 9480 
ggtagagctt agaagtaagg ttcattttta 9540 
atgaagcctg ggaagacttc tgtgagtgaa 9600 
gttcaagtag aaaattatct tatgagtctg 9660 
ttcagtgtcc cagactatta aacaaaccat 9720 
agaaaacaga atcattttga acccccaaat 9780 
gagacctagg caaggtataa tattactgca 9840 
gactgcaaaa aataaaataa aataaaataa 9900 
aacaaaataa aataaaataa aataaaagct 9960 
aaatgccctt tgctgccaag taccacaaac 10020 
gaaagacaaa ctgcctccta caaattgtct 10080 
acatacacac agagagagaa agagagagag 10140 
agagagagag agagagagag accacccttt 10200 
tctcaaaagc aagatagcta tatatacctt 10260
ttgattgaaa tactactttg tatgctataa 10320 
aaataaattg tggggaaagt aaaaagggaa 10380 
agacgaaaaa ttttctagtt ccctagcatc 10440 
tgaccccttg atctgctctt caatagtgtg 10500 
atttctgtgt agcaagcccc tgattcaggg 10560 
tgatttgatt tatttaggaa taagcatatt 10620 
ctgtttcatt ataattgatg aaattcattc 10680 
agataaaggt gagacagcag tgtgcatatg 10740 
acagatgaga tccctctgat ccttcggttc 10800 
accgttccct gccctatgcc ttgacttctt 10860 
aaatggagtg attgagctac ctgtgtgaaa 10920
ttttgttatt tgttgttttt aatctattca 10980 
atcccaggga gcaaggaaag tattagatgc 11040 
gagtaatagt aagatttcct actgtctcgt 11100 
aagcaattcc cccaccctcc ccaccccctg 11160 
atcatagtca tggacacaaa gatttattta 11220
tttcaacaca atcttggaac caggaccatg 11280 
atccatattt agcagatagt caacgttttt 11340 
tgaaacaacc tttgttctgt ctctcaccct 11400 
tctctctgat aatgaaccta aactttatac 11460 
ttatatttct gatgtgactt tgaaaagagg 11520 
gaatgatttg tcagtgtgca cattttttaa 11580 
gtgcttggct gcatatatgt atgtaagtat 11640 
aatatcaaga cccctggaac tggagttgca 11700 
gcactgagcc tgggtcctct tcaagtgaaa 11760
agctctctac tttgcaagtt attattttta 11820 
cttttagtaa gagctatagg taaagacaaa 11880 
aggcccactg agaatttcat tattaatcat 11940 
ttattaatca cctccacgga atctatcgct 12000 
ccggtaatta tctcacaatt ttcgttacct 12060 
atacaaacta ttttaagccc ttgactctga 12120 
tccgtgtatg tatgtgtgtg tatctgggac 12180 
ttatcagtgc atttagtaaa attatttaag 12240 
aaagcttaaa gatccttttt catatgaaga 12300 
cataatttac tgttatgtct tcttcaataa 12360 
tttagtttct ccaggaataa gcacactggg 12420 
ttattttttt tcttttactt taattcagtc 12480 
ccaaaattca gaatcagtgt gatgaattct 12540 
gccttgataa tctcagtttg atgcaaattt 12600 
ttattttatt tgcaaatgac agccactgta 12660 
tgtccagtct cctttgaaga agatatctta 12720 
cagaaatatc attttgagaa aaccaatagt 12780 
ctcgaaagga agaaagctat ttttctaata
agtagcttat atgtgggtcc cactaattca 
ttacagttag aattctagtc taaatttgcc 
tatattaaag caagcacata aattgtacca 
agaacttaaa tgtcattgtc ctaaactagg 
aagatctatg gataccgtgg atgtgaacaa 
tgtaaagccc tctttttttt caggtaccac 
atggatgtgt tcttctgtcc acactcctta 
agtttctaaa tgtgcttaat gaagaaggaa 
ctagcatacc tctaagttgt ggaagatgct 
gttgataagg tgagtgtaca cttccatata 
ggttgtactt taggagagag agaaagcatg 
tttgagatat aataactata ctttagggtt 
tatatgcatt ctcttagttg actgcctacc 
ctatcctata aggaaaaata agtgagaggg 
ttaaatggaa aggtaaagtt acaagtatga 
tgctactaca caataatgga ggttatagaa 
aagttgaatc aatgcagagc ccaggataat 
ttgagacagg gtttctctgt ttagccctgg 
tggcctcgaa ctcagaaatc cacctgtctc 
cgccatcacg cccagcagta attgaaagat 
tgtagtattc atcaagataa gatataattt 
tgaataatct tagatcaaag actacatttc 
tgtttaagga tagaagactc tatttctacc 
atgtgttaaa atttgtcttc caaagactca 
cctggatcat aatgtctaca ttgtaatatt 
cctttcaaat ttctatgaaa acaatttcaa 
aatctattta aaactcacag tcaaatacaa 
gtcaatggca gcatgcattg tgctatgtgc 
cctgttattg ggcagtcttt tcttttcttt 
tcttttttct tttcttttta atattgcctg 
tcctccatgt atttttgtag ggtgggggat 
ataaaaatgg acagggaata tccttccaaa 
tttcacatca cagttgagaa tgctgaatat 
actttctcaa ttgtaaaaat gtatttatac 
tctgggaaga gacactatga ccacggcagg 
ggctggtgta caggttcaga ggttcagtcc 
gcagtcagac atggtgctgg aaaaggaact 
gagaagacag actttctagc agctagaagg 
tcctccacca aggccacacc tacttctaca 
gggacaagta ttctcaaact accactagaa 
gttaatttgg taaggagatg aaaataaatg 
ctaaatgcat gctgttcaaa tgactcagca 
tgagaactca gtccctaaag cagatctctg 
atgtgcataa cctcctaaat atgtaaatag 
caaatggaca aatagcttgg atcatgtgat 
gaagaacttg tattacaagt gagacaggga 
tatcttgctt ctactatttt tggtatcttc 
tatgaaatgg caaacaatgt ctaggaggga 
ttattattta cactacctac atccatcagt 
acagtccact catctataac ttgaaggaaa 
tttcatatct aataaacttc ctatgtgtta 
ttctggacaa tctgataaag aaaatacttg 
tagcctaagt taattccttt ggcaaataat 
taatttctga aacttatttt tggggggttg 
ttctatattt ttcaagtgga ataaaaattt 
atttacttct acattcaacc aaaaattcat 



tcacacaccc ctgattccat tttcctccat 12840 
ggaagcttca ctaaggattc taccgatgat 12900 
tgacatcaaa gcctgtctac tctactgggt 12960 
cttaatatac acatgtaaga aatgaaaggt 13020 
gatgcttgag acacttgcag ttgagttatt 13080 
tatatagatt agtatattta tgccagcaaa 13140 
caatgtgggc aggggtgggg gagtaaacac 13200 
ttgacttctt accatgtgtc ttgagataac 13260 
gacattttac tgatggatgc ataagatcac 13320 
tctcagcatt attgaatcca ttttgtcagg 13380 
atcattttta tttatacagt ggcatttcag 13440 
atatgattca ttaaagacct tataacttat 13500 
acatgtaaca aacaattcta agcaagtttc 13560 
agctctatga aatgacaact gttactactg 13620 
agtttaattt gagcaaagac aatggtttgg 13680 
aatgtgaaga tttaaataaa agtgattcaa 13740 
attaattata gtattatgta ggtaaagaga 13800
tgaaagtttt tttttttttt tttttttttt 13860 
ctgtcctgga actcactttg ttgaccaggc 13920 
tgcctcctga gtgctgggat taaaggtgtg 13980 
ttaaaatttt cttttgtaca ggtatctaaa 14040 
gtcaacctgg ggccaaatta agttgttctg 14100 
atccatttcc tcagaaatgt gctttgagta 14160 
catggggtta taaaacacac caagaactac 14220
tgtcattaat tttaattaat ttacttttag 14280 
cattttcatt ggctctttag ttgatgtgta 14340 
gaagattcag tgaggatcta ttatctgctc 14400 
cataagggaa caggactcca cttgggacag 14460 
cttacatgag agctaacatc aaagctctgt 14520 
tcttttcttt tcttttcttt tcttttcttt 14580 
gattgtttgt cttgtgttcc attccattgt 14640 
gatagttaat ttgacaaata agccactatg 14700 
gtaattttta cagtggagca gctatttaat 14760 
taattccttt gagttcataa atctgaaagc 14820 
aagagaagtg tcttagttag ggtttccatt 14880 
caactcttat aatggcaaat atgtaattgg 14940 
attatcatca agcaggaagc gtggccacat 15000 
gagatttcta tatctttttc caaaggcaat 15060 
atctcaaagg tcaccccaaa gtgacatatt 15120 
aggccacacc tgctaatagt accactccct 15180 
gtattgagaa ttacatgtat attgtaagta 15240 
aaactttaaa aaaaaaaaaa aagagttcct 15300 
aattttggta cttgctgcca agactgaaga 15360 
aatcccgtat gtgtatacag caaggtatgc 15420 
atgacactga tattatcaaa taccaatagc 15480 
gctgataaat gagataatta gaaggactgt 15540 
accattcaag actcttgata atggggctag 15600 
tagataccag tggctagaat gcatccacca 15660 
gatttataca gtgtcagtta ctggtcaata 15720 
ggtttctata tagaaacaga aattacattt 15780 
gaaaaaggga taatatgaaa atgatagtac 15840 
gcctctagtc taggtgattt gtgtattctg 15900 
ttatccttga ttatagatga catatataat 15960 
atagaagaaa taaaaaaatc tcaagtattc 16020 
gcatttctcc tccatcattt tttcattctt 16080 
tcatatgaat tttataggtc tcaccataat 16140 
ttctcaagaa ttaaataata tgttttaact 16200
agattccaga ggaaaacatt gtctcgagca
ttcttcttct tcttcttctt cttcttcttc 
ttcttcttct tctcctcctt ctcccccttc 
tccccctcct cctcatcttc ctcctcctcc 
gaaatatatt cttacttcta aacaagaaaa 
tctcagaagg ccaggtttca ggtgtaatga 
aggtatcttc tcttcaggat gctaaggatt 
gctgctggca aattgtttag aaatctggcc 
agtgatgcat tctgtgatct gggcaaggtc 
agatattgaa ttttcaatcg ccttgtaaaa 
gttattgtaa ttttatgtgt ttgcacttga 
tgaagatttt gttttgtgca agcaagcaat 
caccttatta gtgtcttgtg cttttgactt 
ctcttatgga tttacaccag gctttcccac 
gtgaggctaa tctaattatc taggcttaac 
tgttatgtcc agcattctct acttcttaaa 
ttcaagttga tagtgtaatt gaagaaagaa 
gttgtgttac tttatagaag cagtcatttt 
atggttttgt tcttaaatgt aaaatttctg 
tcccttgttc agcatgctag aagctttgaa 
tttttaatga ggagctttta tatctcaagt 
tctcatccta ttgcctacac tttctgaata 
tttgtagcat tcagaataga aaccaagtct 
tataccctta gattacaggt gaaataatta 
ctacacaagc caagaaacta gatttagctt 
taaggtcctg cttttctttg tttctaattt 
atctcatggt tacaatgatt cattcaatag 
agtacaatat tgtgccagtt ggcttttggg 
tttggattat agattgtaac tttattatga 
ttctagaata tcaattgaaa ttagagatca 
aaatgataaa ctcactaaaa gacacagatg 
atagaaaacg gaaacaagtc tttttaagta 
gctaatcctt taaggagaga aagggaaaat 
tatcaccttt tgaatacaca gaataagaac 
actatttgat gaagtgactc catgagaagg 
atagctcaag ttactgttat attgagttat 
ttatttccca tatttttgtg tgttttctaa 
tacctcctct ctttcactca ctatatatat 
atgcatatac gtatttttat atatgcatat 
tctcttgaac ttgtattctc atttacaata 
aacctactgg taaaaacgat ttgtatctga 
tgatccacag ataaatttct cagagaaaat 
tatttatgtt atctagacag cccagagggc 
caatccactg aaactcagtg ataatcatag 
tgtagtcacc cagttctccc acattggtga 
aagaacatag gtttctgcct agccctactc 
tgattttcag agatgcctcc tggaatacgt 
taatatttga tttcttgatg attttacttt 
tttgttttgt ttgtttgttt gttttgtttt 
gccctggctg tcctggaact cactttgtag 
ctgcctctgc ctctcaagtg ctgggattaa 
cctcgtcttt aaatagttca gttcagtaaa 
ttgacccaaa accctctctt tctcatctct 
gaagtatgct ctaggtttgt gcaggatgga 
ccctaaatat gtaagtaaag tcttctcaga 
actttaatta atctcaagca gtttgcttcc 
tctctccctc cacaccttat aattgctcaa 



tatgtggttg tcttcttctt cttcttcttc 16260 
ttcttcttct tcttcttctt cttcttcttc 16320 
tccctctccc cctccccctc ttcctcctcc 16380 
tctttcttct ccttctcttc ctggtcctta 16440 
aaaatgatga acaactctag attaattttt 16500 
gtatacattc ctagttctcc ccctcctaag 16560 
aatatatatt attggcattt ggcaaagatg 16620 
tattttagag ttacttcata taaaatcagg 16680 
cacagggtcc aagatttaca ttgtataatt 16740 
cttggaatgt tttttgttgt tgagtcattt 16800 
gctgatggct tctgagaacc tcttcttaaa 16860 
tgaattacct ctttcctaaa attattcagt 16920
acattgtcta tttaattgaa atgttaggtt 16980 
aaacctgcag agcagcagca tctttttgag 17040 
aatctggagg cagagaattt ctgaatgaga 17100 
aataaacatt tctaagtaat ggaaaatttg 17160 
aagaaaattt tctgtttgga agctacagtg 17220 
ctctttgtac aatattttta attaattaaa 17280 
ggaatttgtg attttacatt tatcacaaca 17340 
cattccatta tggatgtttt tattttttat 17400 
tcagtatgta tctgaaaatg gccttgaact 17460 
atggggtgac aaaggttgcc aaacctgctt 17520 
ttgtgcaggc caattctcta caatctgagc 17580 
aagtagaaat aatggtatta tgcttgagat 17640 
tctggttctt attcctttct tctccaagtt 17700 
gatggtctag ttgttgttct aattttcttt 17760 
cactcattcc tatgaaaaaa caagactgtg 17820
taagaaaata tttaaattta tatatgctta 17880 
caaagagaag agaaatgcct tggactggta 17940 
gaaaggtaag aatgtctgca tgaaataaat 18000 
aattaatgga ggaaatgaaa aagagagaga 18060 
tatatgactt ttacagaaga gtgaatgtga 18120
taattgtttg tctgtctctc taatccttag 18180 
aaagaaacaa attatgtcag aaaacaagtg 18240 
tcaatatttt acgttcaagg tctttttgac 18300 
tgttatattg agttatagtc attttgaaat 18360
ctttgtgctc aattttcttc tcaatttata 18420 
gtaaatatat atgcatatat gtaaatatat 18480 
ataggtacgt atgtgagcat ttaatagtac 18540 
ttgtgagtac tagtttcaca atttgatatt 18600 
gttcaactat tctgctatgg tgatgtttgt 18660 
aatgaaaagt gctttatatt cacaaataga 18720
acatggctaa tgatgaaaat ataatcaaga 18780 
gagtttatag cacctgacac aagatagtca 18840 
gacatacgga aacactggat aggtgaggtt 18900 
tttaatttca ataatgatgt tgatagtgag 18960 
tctatgtaca ctatttttct ctttgattat 19020
gtacaccctc atcatctttt tgtttgtttg 19080 
tgttttttcg agacagggtt tctctgtata 19140 
accaggctgg cctcgaactc agaaatccac 19200 
aggcatgtac caccatgcct ggcaatacag 19260 
aaaaaaaaaa aacaacatag cattctgtct 19320 
ctacttgtaa tctatttgta ttactgtgta 19380 
tttgtgtcag ctgcagtttt catgactatc 19440 
taaagtcact tttttagtgg gaaaaatcat 19500 
cacggatcac aaagaaatag tatagatatt 19560 
aaatgaaggc aagtttgttc tggatgctaa 19620 
atatgagtct
tatccctgcc 
actgagcaat 
ttcaaactcc 
catagggttt 
gcagaattaa 
ctctggagca 
gggtgtgggg 
gagcattgtt 
taaatacttt 
cccagtcttt 
actcattcat 
atttccttga 
tgatgtgtaa 
atatgccctc 
caggagaaat 
gtttaataca 
gaactaagga 
atgagacatc 
attcctccaa 
agctgttatc 
acagatattt 
acagtgttag 
cacagcatac 
taaggacaca 
acttcaagag 
agaactgaga 
gttttgctct 
ccctttctct 
atggcagtgg 
taaaaaacaa 
tgattgtaag 
actgcaatgc 
aggaatcagt 
actgtggtta 
aaaattgtgt 
tatcattggc 
gtcgctctgt 
gtgtgtgtgt 
agatgggagg 
ctagagccac 
tgggtgtgtt 
gtagtaacac 
caataatgat 
tgtcaccata 
ctaaatggtc 
actcttcctt 
tagaaaacta 
atgatggtag 
aggtttagca 
gtcttctaga 
tattgttatc 
tcctatcata 
tcctgagaca 
ttcttccctt 
attctgtcct 
taaaaatttt 



cttgtttcca 
tcccgtgtaa 
ctgtggccat 
tctgtcagca 
gtaagtggac 
gggtttgtga 
gctgggccca 
gtgtgtgtgg 
tttattttct 
ccagtgtttt 
tcttttagag 
ggaatttagc 
aaaattcatg 
cagtatgacg 
agtcactttc 
ataatgggca 
aagtgtctta 
gtagcagaca 
aaccaattac 
ggtatcaagc 
ttggaggatc 
gttgtgtaaa 
tgaattcaaa 
aaatattgaa 
agataatatc 
accgaaatac 
acaacacttt 
tgccatagtg 
attctctatg 
cccccaggaa 
cttattgttc 
tcatactttt 
gtcatcagtc 
taaagttcta 
tttattctca 
tatttacact 
attgggtagt 
cgctctctct 
gtgtgtgtgt 
taaaaaggtt 
agagggagag 
tccctcctcc 
aattaaataa 
cttactgacc 
gaaattacat 
agtatctgac 
tgactaagtg 
aatggcttcc 
ctaagtatca 
aataaaacat 
agaggctgac 
atttggtttt 
ggtatgaaag 
caatttagaa 
tggtagcaag 
taattatcac 
ttattaggta 



caagaatgaa 
agagtgatgg 
atgctgcccc 
cattcaaagg 
agtcctctag 
ctgataaaag 
cagtgtcatg 
agaaaaacat 
cttagctatg 
gtgtatattt 
attcccatat 
agaatttgat 
tcattacaca 
tgcaagtcta 
ttaaaaaggg 
gtattcacgg 
agcacttatg 
caagctgtgc 
tattgtttta 
actgtaatgg 
tgaaagtcta 
tgttaaggta 
ttaagtgtta 
tcacggcata 
atgctttaaa 
gtatcaggaa 
gaatggaatg 
atatgctctt 
accacaatac 
agcacatttt 
tctatagaca 
atttaccaca 
agaaagccca 
atatatgaca 
cagtctctta 
tcagtgatgg 
ggattctgca 
ctctctttct 
gtgtgtgtgt 
aggaaatacc 
atgcacatca 
tttcagatat 
aaagactctg 
acttattgag 
taatttaaca 
aactttgacg 
attgtaggtc 
tattcagtct 
attttttact 
tactgcttac 
agaactttaa 
cagaaaaaga 
ctctcaatga 
gagattattt 
aaaggacatg 
actagattat 
ttttcctcgt 



agaatgatcc 
aaggtgagcc 
ttggttttgc 
tgaagaatgt 
caggtgctct 
taaaacaaat 
tcctagtttc 
cgagaatatt 
tcacttttga 
ttgaaatttg 
tcttctagtg 
aaataagaca 
gtgaattatc 
gcacagtgtt 
gatgcgtagg 
caaggaacag 
ctgggcagac 
taacttatat 
taaagttatt 
gcatgcagta 
gtgggtagag 
aagtggatag 
gtgaattcaa 
gtaagtgata 
attgtaggag 
gtgatttcca 
aacgttttct 
aatcttctac 
cacaggcaag 
tctgtctgtt 
aataattctc 
aagaaaaaac 
ttgtgaactg 
agctgcagga 
atcaccatga 
aaataaatgt 
gtttatgaca 
ttcatatgtg 
gtgtgtgtgt 
catttataac 
gtggtgacag 
tgcagaaaac 
atcatgaatg 
caagaaatat 
ctggtcttat 
agatggtcat 
ttctgttgaa 
actctccatt 
cattgcaaaa 
ggcttctctc 
tggtctggtt 
aatacccaca 
agatgtatat 
gattctttct 
tgcatcttgg 
ttttcttttc 
ttacatttcc 



agtgtgcaga 
taaagaaact 
catatggctc 
agagacgaaa 
ccagctgggc 
gcctgagggg 
agagccccaa 
ctattgagtg 
acttagcaat 
aacatctgtg 
tgtatggagg 
atttactaat 
tggttgtgtg 
gcatcagact 
tcatgcaaaa 
ttgtaaagag 
acaactgaac 
attactgacc 
gccataaacg 
tgaagaggca 
aaaagttttc 
tacctaactg 
attcttagtg 
ggagattgga 
aaacactgag 
cataggccag 
tcattcacac 
ttcagacctt 
gtgaggaagg 
tagccagtgt 
aattgaatac 
taaaacccct 
atgtatgtta 
aacattctgt 
aatgggcaaa 
tatgttactc 
atctctctct 
tgcacaccct 
gtgtgtgtgt 
taatgaagtc 
agtaaaccta 
cccaaggcta 
actcctaact 
gtatcgtgtt 
gtggtgtact 
ttgtttctgg 
cctgctgcac 
gtaggataaa 
ccacatttgc 
ttctactttg 
aaggtcacca 
caaagcactc 
aaaatgtgtg 
ctgaggcttc 
gcgtggatgt 
ttttttttta 
aatgctagcc 



attccaatac 
gtagatcagc 
tgagtctaat 
gaaacaccac 
tggggcagca 
agaggagagg 
agtacccaag 
atcacaaaat 
gtagctttat 
catcattttt 
gaaagcagag 
gccctcatta 
ctattcacaa 
atttctaaga 
ttgagaaaaa 
cacccccctt 
attctgtctg 
aatgtataaa 
ttgctactga 
gtgcagattc 
ctaaaacagg 
gggaggctgc 
tagggacttc 
aatgagagca 
gccggtgctt 
tgaattatgt 
cagggattca 
ctttgccttt 
agactagctt 
tttcactttt 
agcatgttac 
gtcacttata 
gtagattgga 
accagactgt 
tacaggctgt 
atttatagta 
cgctcgctct 
gtgtgtgtgt
acttcaagtg 
ttaggacagc 
gttacaaata 
tgtatcaaat 
tgtttgcaac 
atgtgtgtta 
taacttttta 
ctaagatggg 
aataataatg 
aactgacatc 
atgtttattg 
tacttggttt 
catgctagtg 
tcctgaatat 
catcaatacc 
tttttacctg 
acttctcagt 
tttttctttt 
caaaagtccc 



19680 
19740 
19800 
19860 
19920 
19980 
20040 
20100 
20160 
20220 
20280 
20340 
20400 
20460 
20520 
20580 
20640 
20700 
20760 
20820 
20880 
20940 
21000 
21060 
21120 
21180 
21240 
21300 
21360 
21420 
21480 
21540 
21600 
21660 
21720 
21780 
21840 
21900 
21960 
22020 
22080 
22140 
22200 
22260 
22320 
22380 
22440 
22500 
22560 
22620 
22680 
22740 
22800 
22860 
22920 
22980 
23040 
ccatacccac ccacccccac tcccctaccc
ccttgtactg gggcatataa tgtttgcaag 
gactgggcca tcttttgata catatgcagc 
ttcataatgt tgttctacct atagggttgt 
tagctcctcc attggggacc ctgtgatcca 
gtgtttgcta ggccccggaa tagtctcaca 
aaggctggcg atctaagcac tattactatt 
ttacagacat tgattggagg atgagggggg 
gtctagattc tttaagtcta ttttacaatg 
tcagtgagga atttcatttt agaattattt 
tttctaaatg tgtagtgaga taaatggatg 
ttgattttgt cctttaattc attgattgct 
acttagattt atgtatctgc atgtgaagga 
tgtactttat tccctaggaa gagggtctct 
gaagctccac agagcccctg gaaaggagta 
atctagactt ttgcagcttt gttgtagcta 
tgggggtatc gactggacta gtgatgttta 
ttgcttttcc tcaggcatgg ctctaacagc 
atccttatat tgcccacatt ttatttttat 
atgaggatct gtgctatgat taatttaatt 
atttaaagat ttttcttttt attttcattc 
gactatcaca tgcagtgtcc atgcgagtca 
gttacagatg gttgtgggat agcatatggg 
gagcagaaat gactcttaat tgatgagcta 
gtagcaaatg gagaactggc ttgtatagct 
tttctcttca gaggcagatg gatctttgaa 
tggaagtgga gactgtacat aattttgcag 
gatttgggga tagaagtttc ataatgcagg 
gaggttgctg gtattttcat cttcttatca 
gcagtccaca gatgatccaa gcagaatcac 
ttagtattta gatatgtcaa ataatgtaaa 
tctcaggttt acagggctga aaatgaggct 
ctggtgttta gtcatctgac ctgaccttta 
tcttcactag cagtgtcaca tgtagaatta 
cctgtttctg tgactgcaat ttaaaaattc 
tttaatgaga taattaaagt aaactaatgt 
gattatttta aaaaaagagg tcaggggcca 
aattcatgaa gacctgcaga tcctcagtaa 
aataaaaagc aaacatcgta aaaaaacaac 
atctataatt ccagcactgg gaaaaggcta 
gatcagtcag tatagccaag gaatcagtac 
caatggtgga gcctcttgag tttctcccac 
acgttctcct tttaataata aacactatga 
aataaaagaa gattgagaac tgagaactgc 
tacacacaca cacacacaca cacacacaca 
cacacacgaa atatacatcc cccccgtgaa 
aaagcatcat gacacaagac acggcaactg 
agcattgtga gaaaatatgt tcctctagtc 
gggagacttg ggagtctaac tgaagcagta 
gaggatgttg ctccacgcca gttccctgcc 
tgggatgaaa gccagcttcc tcttgctgtt 
tgttctagtc ttcagtcttc taacttccct 
aaaccaaaag tgagaatgct aaggttggca 
cagaggggca cccactggga gggaagaggc 
gttacagatc gtcatgttct ctcatggcct 
cctccctggc tcctggctag gactctgatc 
ggaagcttga atccctgcat agaggtatgt 



actcattccc cctttttggc cctggtgttc 23100 
tccaatgggc ctctctttcc agtgatggcc 23160 
tagagacaag agctctgggg tactggttag 23220 
agatcccttt agctccttgg gtactttctc 23280 
tccaatagct gactgtgagc atccacttct 23340 
agagacagct atatctggtc cttctcaggg 23400 
gcagcaaaga catactctac ttggtatgca 23460 
ttaggaaagt taagatttca gaagatgaca 23520 
tttttctcta gcctaggcca agagacatag 23580 
tacatttgaa gtttctagaa tttggcacaa 23640
aggaagggat taactttaaa aagctagatt 23700 
tgtttgtgtc tgtcatatcc ccatgtatgt 23760 
taggaggatt tcggtgtctt actgtgactt 23820
tactgaactt gtatgtagac ttgtggccaa 23880 
gctgagagaa ttctaacctg attgatggtg 23940 
aaatacattt gaggttctta tgacacacct 24000
tccttctatt catcagaaac ttatatgaac 24060 
tttacaacta ctctttgagg aagtatgatt 24120 
aattgccata gttgtctttt atgggatata 24180 
caaccacaca agatagataa tcttctattt 24240 
atgtatgagt gtttacctac atatttgtat 24300 
gaggagagaa atagattccc tggaattaga 24360 
tgctgggaag caaacccctt tctttcagaa 24420
tcttcccaac tctatacctt cattctcata 24480 
tgactgctgt catgcatctt tttttttttt 24540 
tcagaacaat gaagggaccc agtctctcca 24600 
ggggcttggg ttttatatgg tgaaaagggg 24660 
tcagttctcc tgaagtctca gtggaggttg 24720 
gaagcttccc tgggaagcta ccacatgcca 24780 
atagccttct aagtgtatgt attctaaata 24840 
tatgtaaaga aggagggagg taaaaactgt 24900 
caggaaataa aatcatttgg acaaggtgat 24960 
cttcagcaac ttctgattcc cttcactact 25020 
tgtactgttc cctaaaattc ataggctgtg 25080 
atctcccagt gccatgtcct atgacttgaa 25140 
cttatgggtc tgccttaata caatataact 25200 
gggagatatc tcagttgata aaatgtttca 25260 
cagcatttaa aaaaatgaaa ttaataaacc 25320 
atcacaaaca acaaaaaccc gaatgctgat 25380 
gctacaggtg ggagatctca aaacttaact 25440 
caggttcagt tagagacctc ggctccaaaa 25500 
agctcacgag cctgctccta tctttcctga 25560 
tcctgtttcc aataataaat agtaattaat 25620 
agaaggcact caatagtgaa ctctggcttt 25680 
cacacacaca cacacacaca cacacacaca 25740 
cgaatgaaca cgtacacaca taggtaaaag 25800 
atgatatctt catcctgggt tttaatctct 25860 
tgaaacatcc agtccctaat actgtgctct 25920 
agcatcctct gttgaaaata aagaaggaat 25980 
ttcaccaagc ccagaggtca gatgacttcc 26040 
cctccagtcg gtcagcaaac gccttcttcc 26100 
cctgcgacgg ggcagatcga ttctagaaca 26160 
ctctcacttc ctctttgaat atagtacttg 26220 
aggtgtccca gggactctgc gctgccacca 26280 
ccactggttg cagaaaatgc caggatgatg 26340 
atggcactgt tcttctcctg cctgacacca 26400 
gtcttgatcg catgtgatca caccctttcc 26460 
tgctagcctg ccttgtttct caaaactatc
cttagtttat tttgcacgaa ggagttaaac 
aaaagcctca gtgttaacac agggcaggaa 
aggcgtctga gtcatgattt agggacttga 
ctgtgacata cctgagagtc gttaaagaac 
gggacgaaag catgactgta atcactgaaa 
ctgtgggaac tctggccact cagcttttag 
aaggtagctg tgtcttttct gctgctttcg 
tcagcaggta aggtcagtgg ctgccaagcc 
tcacaagtta gaggcaggga atctcctcct 
ccttggcagg tgtgtatatg catgagcaca 
taaatgaaag aatttctatt aaaaaataaa 
ttctaagtga ttttagtatt ctgcaactaa 
gggaaggaga gggacagagc aacttaaaac 
ccgttttggt cgaatactga gtcgtgagta 
cctgcacaga ccacgcaatt atcttagaag 
aattcggggt gaggtaggct cagaagctca 
attcacatgg ggaaaaatac tgttaatttt 
tgattcctca gttatgttgg cagagatgga 
tcccgttgct cttggtgtct ggtgtgcttt 
gtaagccagg tctcagggcc cttggccttg 
agttcatatt cctttctata cccacacaca 
ggggcttaga tttatcaaac tactccttta 
gttgcttcat cctacagggc cttagcactg 
cacctaggtc aaggctcacc atgctaaaat 
cactgttgat caaattgaaa aattaagaat 
atgcaattaa atatagttat gatgcgtgaa 
gctaagcact agcctcaggt ctgtctccag 
tttcctttag ccagggtctg tctcacccat 
tgctcttttc accaggctcc ataagttacc 
tttccaaatt cctgcagctg gctggggtgt 
tacatttcaa atgccaccct gaaagttccc 
cacccagtcc cacttcttgg ccctggcgtt 
ctctcttccc agtgatggct gattaggcca 
agctctgggg gtactggtta gttcattttg 
accaccatac ttttctctgg agcccagttg 
acttggtcta caatcaacag gtaacacaat 
tttttattag atattttctt aatttacatt 
ctcccctctg ccctgctccc caacctaccc 
gtactgtttt tgtaaactaa tctatgttaa 
tgtggagact tgaggaccca ggatagggga 
gagggtgagg gagcaccctc atagaggtag 
cttgtggagg gaaaaccggg aagggggata 
aaaaaacaaa caaacaaaat cctcaggtgg 
acagcctccg actatctgca atgtgcctct 
ccttactcct gcctccctct tcctcttcct 
ctagtgattg gtttcctgta atgtttatta 
tacgaagata ccttatgttc aatttttgat 
gttttactgg acattttgat ttctcctatg 
ttatagctgc agtactctaa tgtggagctt 
tagatttctc tgtgaatgtc tcctcaggtt 
cccctttacc ctctccatat gctctttcat 
atttttccac catttattct cccctttgtg 
gggtttgtta gcatgtcacc aatgcctctc 
ttcctcaggt tgtggtgacc ttcacccata 
aattttgtta tggtgttgaa cgataatgta 
gatcctgtaa atggattgtt tgacccttaa 



cacagctcag agctccctgt gtgtgctctg 26520
taaccaaaaa cttgagaagc cttggcaaca 26580 
caggcagcca ggggtgtctt gtttcattta 26640 
aattagtaaa actagtttat agtcattgtt 26700 
ttactgaacg tctctgaggc cagtattcac 26760
aatgtaagta ggctgtaatt tcagggcttt 26820
cggtcattcc ttccctttcc aaatcaagtg 26880 
aagcatcttt gagatgcttt gagtggtagc 26940 
tgatgaaaat ctgagttcaa gcctcaagcc 27000 
ttaagatgtc ttctcacttg caagtgtctg 27060 
cacacaaatg aataaaggga acaattgtct 27120
acaacaaaac acacaaaaac acaaagactt 27180 
ttctaggaga taaagaaatg ggaggggtga 27240 
atcaattagt tactgctaag gcagtaactc 27300
atctgaccca tgactcattc ttgttttcct 27360 
ctcacaatag aactgagcaa acaaggaagg 27420 
aaactggttc aatgagttaa gatacatgac 27480 
aaaaagttat aatcacagta tcttgctttc 27540 
atttccaatc agtgctacac tgagataaaa 27600 
gtcaactctc aaagcttgct tgttccttct 27660 
tcttcaggag tgattcctga ctggtttcct 27720 
gtttcttctt tatttgttgt tattggtcca 27780 
tactcttaat aactctttgg aaccatgatg 27840
cctaagctaa ctacacacac catcatccct 27900 
tatggaatcc ctgtatatag tttaaaactt 27960 
aaatgcatca aattagtttc aatgattttt 28020 
atataataaa agcatcccac actaacactg 28080 
ccctatggac aggccgagga gaacatgttc 28140 
gcctgctctg tgtctccaga gctctgaaat 28200
atggctggct gatgccaagc acgccccaca 28260 
actttttttt tattagatat tttctttata 28320 
tataccctcc ccccaccctg ctcccctatc 28380 
tccctgtact ggagcataaa aagtttgggc 28440 
tcttctgcta catatgcagc tagagatacg 28500 
gctggggtgt actcttgcac accacactct 28560 
agttgccatg tgaaggaaaa cacaacacac 28620 
gttgggtgca gaacctagca tcctaatttt 28680 
tcaaatgcta tcctcacagc cccctatacc 28740 
actcctgctt cctggctctg ccattcccct 28800 
aaatcctccg actcaggagc ctcttgttct 28860 
acactaggct gttaaggcag gagtgggtgt 28920 
gggggtgggg gacggcgagg gggtaggggg 28980 
acatttgaaa tgtaaatgag taaaataacc 29040 
cagatcttgg aggatccacc acttgaattg 29100 
aatgctctca gccatccaca aagagacctt 29160 
cttcccgact cggaagtccc acctactcat 29220 
gggggasatc ctaccacata gttaagcaat 29280 
acaggaaatt agacattcag caacattttt 29340
cgtgtttcat atttcatagc tatgtgtggc 29400 
tgatttcagg attatctttt tcattttatg 29460 
gatttttctt gattgcctca tgtacatttt 29520 
tgatcatatc attttgtatg tttgtctttt 29580 
tagaataaac aagaagggag tattactgct 29640 
agtggttaac gctaagaccc tttagtacag 29700 
aaattccttt tgttgctact tcttaactat 29760 
actatcccct atgcaggata tgtgatatgt 29820 
s-tgggtcaaa gtccacaggt taagaaccac 29880
tggcctagat
atgacttctg 
ggctctgcca 
ccttttgaca 
atctagaatc 
ggttctatga 
cctttagcag 
atatttatca 
tttctgagcc 
caattgataa 
tcaaaatctg 
ctcttagata 
aagaagcctg 
ttctttgtct 
gctaatatgt 
ttaattaatg 
gcttttaaat 
tcatataatt 
aacagataaa 
taccactgga 
caatgaatct 
tatattttta 
caataatttt 
gcgaggttca 
actcagtatt 
gccttttaca 
acattaagaa 
ctcctctcag 
cactggactc 
tataaagtgg 
aaatggccaa 
ttatatatgt 
gcgggcaaca 
tggaaaagga 
tactctgtac 
ttatgtaaca 
taacatattg 
caaatatgaa 
atggcaggta 
ctggtctata 
ggacaatatg 
gagaggagtg 
ttcctaatat 
cttcttcaac 
gcttctccaa 
atgtgatgca 
caaggtggtt 
aaatccttgg 
aatatctaag 
aaccatgtct 
catactccta 
tgtccttatt 
tgtttataac 
aagagtgaat 
atatgtaaga 
aacttaggag 
ctaagattgt 



catgataggt 
aacaccatct 
gtttgtctac 
tacatatttt 
tgtgtctttg 
gtgcagtgcc 
aaaaagccta 
ttcattcatc 
ttgaagataa 
atgaaaagca 
taaaggaaac 
cattgccttc 
tttatgtatg 
ttagaggatt 
agcaaggaat 
caagctttga 
aagagaaaac 
atgatattac 
actagtggag 
atccatgcac 
aagaacaaaa 
ttggtactct 
acataaaagt 
cagaagatat 
ctttagttct 
tctcatgtgc 
tcacagtatc 
tgatccctga 
cagaattctg 
gagaaatagt 
ccctgaaaac 
gggattttat 
cggacttgaa 
aaagtgaagt 
caagtggcac 
cctgattcta 
agaattcagg 
gaggcagata 
ttttgggagt 
ggaaagcttg 
ttgagtggca 
ggaacacacg 
tacctaccaa 
caagaacata 
tttttcagaa 
gaatgtgaag 
ctaactaaac 
aaaagctact 
ggcaattcta 
tatatgttgt 
attattaagc 
attgaacata 
agtaagtaat 
gttttgttat 
tacctgaaat 
ggagatgtaa 
cctttctgat 



cttcagttgt 
gatgtcctcg 
atttgttccc 
ttcctttacc 
caactttcgt 
atcagaaatc 
ctatttaact 
caacaaatgt 
atagcaaaca 
tctggaaaat 
aaaagagggt 
aaggatccga 
aatcttctga 
ttttgaagat 
gaatagtaac 
gataagctct 
aaaactcact 
atgaaatgtt 
attatgatag 
tataagacag 
ataattgtca 
ccatgtagct 
atatacaagc 
tataaagtca 
ttgtgtagtg 
aggcagccat 
acagtaaagt 
cctgtaggtg 
cattttggtt 
ctttcccaag 
atatacataa 
ttatacaata 
aaacagcaaa 
gatataatta 
acaacacttg 
ttgtttctca 
ggatattttt 
aataaatgga 
ccaatgttat 
tttcaaccaa 
tgtacatgtg 
gttggaaaca 
tgcatggatc 
gatctgagct 
cttcagtggc 
tagttatttt 
ttccttctgt 
tttatttaac 
acctctacat 
atgtctaaac 
aggataaaat 
cagcacaaat 
aaaatatgtg 
gtgtgtgaga 
attgaaagta 
atgtccaagt 
tcagggcatc 



atgtgtagta 
tgttctgcct 
acttgttatt 
acacatttcc 
agttcttatt 
atgtaacatg 
tatacgggct 
ctattgagca 
aaaatcatca 
aactatatag 
gagagaagaa 
tgatgagtac 
ctatatgtgt 
tagtataaaa 
caatgataat 
gatctcattt 
gggttaagca 
atggtatagg 
agaaaacact 
cctctgatcc 
aaaaataagg 
gtatataatg 
atacattaac 
aaccagcaca 
ttgaagactc 
gtcagtgaaa 
tcgtaacctt 
ttggagttgt 
ggttgtgatt 
caatagcaca 
gtaatattat 
tacaatatat 
gacaagggag 
taatttcaaa 
ttatgaaatt 
tttaatcata 
tcttcctgat 
gaaggatggg 
ctttgactgt 
aataaatcat 
agagttttat 
taacaattgt 
agaaactcag 
tcaacccctt 
tggatttatc 
ttatatcatt 
catctattca 
tggtttcagt 
tatttaaaac 
tacagaagaa 
cctctttaac 
aaaataaatg 
aaaataaggg 
ggatcaggaa 
gaaaagagag 
aaaacatcaa 
ttaccatttg 



tgtgtgaaac 
gtggcttctc 
atttgcttat 
ttgatcagct 
catgttcttc 
tattcttgta 
ggtgtcccac 
ttgagaggtc 
gagcatcaat 
gcaagagatt 
tttctgtctg 
catttaggga 
attaccccac 
tacataagtt 
taacattaat 
agccctttga 
aagcattttg 
gttcacaata 
caaccctgag 
caggaccaaa 
cagaatctag 
aaaatgatga 
atggctttta 
caagcaaaac 
ctgcacatgt 
ctttatgggt 
tggactcata 
attgtaagtg 
tttttgtcgt 
gcaattagtt 
acaaactgaa 
atatatcaac 
taagaaaaaa 
taatagtaat 
aaggttttca 
attttgttgt 
atgtggaata 
tgtgatacca 
atagctaatt 
gaacgaatga 
caccccatta 
tgtgtggtat 
caaagtccct 
gaagatctta 
caggtaatga 
gcattcttgg 
gtagtgctac 
tggatgggcc 
aatttcatta 
gaatttatag 
aaataagtaa 
ttaactaatg 
caacacactg 
agattttgag 
tagagattga 
ctatgggcaa 
ttggaacata 



cagtgaaaga 
catgacagaa 
gttcttttct 
ttccttctga 
tctgttagct 
ccacccatgg 
caattacaca 
accatgtacc 
gctcatggtt 
taccttgtca 
atgccttact 
gatgtgtgtg 
ctcttttatt 
gtaagtaaat 
atttatcact 
gaattctatt 
ccagatgaaa 
aatgtgagaa 
tacaattttc 
ctgagaaagt 
gaaatgtctg 
attagaacaa 
catacaacta 
tttgtcccac 
gtagctgttg 
gtagcttttg 
atctttcgtc 
cttccattgg 
gatctctgtt 
accaaatgcc 
caggttctac 
aattaatgaa 
actttaagag 
aaaaaagatc 
gacttgagag 
agcagaatgt 
agatgtcttg 
tatccccaga 
taaggccaga 
atgaataggt 
ttcatctttg 
ttacaggtag 
gatgacattc 
aaaagctata 
atgagctttt 
cttagaaaac 
aacttgctgt 
actagataag 
gatatttatg 
atacaaaacc 
gttaaagtct 
ctaatactgt 
tgtcctatag 
acatgagtac 
aaaaaaaact 
gaaacagtta 
aaaactttta 



29940 
30000 
30060 
30120 
30180 
30240 
30300 
30360 
30420 
30480 
30540 
30600 
30660 
30720 
30780 
30840 
30900 
30960 
31020 
31080 
31140 
31200 
31260 
31320 
31380 
31440 
31500 
31560 
31620 
31680 
31740 
31800 
31860 
31920 
31980 
32040 
32100 
32160 
32220 
32280 
32340 
32400 
32460 
32520 
32580 
32640 
32700 
32760 
32820 
32880 
32940 
33000 
33060 
33120 
33180 
33240 
33300 
gccagtattt
ttcatacatc 
gatcactgtt 
ttattgtttg 
ctatctgaga 
cccaaaaatc 
aatatctatt 
gtctttgtca 
ggctcttaca 
ttgtgaatgt 
tacatctaag 
ggggtataaa 
aataaccaat 
tttttgaaca 
actcctaaaa 
gtaatagttc 
tcatgacacc 
catgtcagtg 
tcttctctac 
atttagagta 
actagaaaat 
ggaccacagc 
ttgcttgtaa 
ccctccatgg 
gtttgaaaac 
gatagatttg 
ttaatataga 
taagataatt 
ttagctcaca 
taaaaaaaaa 
cccctgaaaa 
tactggccct 
cctcccattg 
acaccatatg 
gttcatattg 
attttattca 
tatttgtcag 
gctcttgttg 
a-9"9tggcagt 
tcttttcatg 
ttccttcttc 
ggctaatatc 
tcaggatgat 
taatagctgc 
ttgagggaca 
tagtagagca 
gtattgctgg 
tccagagtgg 
tacatcttgc 
gaagtggaat 
tttttttagg 
tgtaccccat 
tagaatccta 
agtggaatga 
agagttaaca 
ttgttaaatt 
actccaaaga 



caggcgggaa 
tatgtaatgc 
agctctgacc 
acccactgat 
ctagtgaggt 
taaactttgg 
ggcatgatca 
tgtcccacat 
tttctatcag 
tcattacagg 
aatatcctca 
tgtaaatatt 
gccctctctg 
ggttaacagt 
tagttatgcc 
atctgggcca 
tttctgctga 
ggttgtactg 
catattccta 
tggtattaaa 
gtattatact 
tctgttggtg 
cttttcctga 
tgaaaagtta 
atgctatcat 
aagagaactg 
gaacatagtt 
ccttaaaggt 
ggatattttg 
gataattttc 
cccactatct 
ggcattctct 
atgaccaact 
ttttctttgg 
ttgttcttcc 
ctggggaccc 
gcactggcag 
atatacacaa 
ctctggatgg 
ggtattttgt 
ttgagtttca 
cagaattaag 
gccctccagg 
atagtactcc 
tctaggttct 
tgtgtcctta 
atcctcaggt 
ttatatcagt 
gagcatctgc 
atcagggttg 
tacttttcag 
ttttcaatat 
acagaaagtg 
tagaaactca 
tggctccatg 
actaaggaat 
aaataggaaa 



gctcaatata 
atacagctac 
ccttccaaaa 
tttaaccagg 
taaccatttg 
cagagaagaa 
gtgcagggaa 
gtccctcatg 
attcctcgtc 
ggtgatcaca 
ttcactactg 
tagacagaag 
caccccaaac 
actagccttg 
agaatttcag 
tagctggaag 
aagcaaatca 
atcaaggaga 
aagctactgg 
agtcttctgc 
ctttacagga 
gagcctgtgc 
gtctctgtct 
ataaatgaga 
aatcacttta 
aggtaagcag 
tttcacccag 
ttggagtttg 
cattagagca 
attatgtaat 
tcaccccctc 
tatgttgggg 
aggccattct 
ttagtggttt 
tagcactgca 
tgtgctccgt 
accctctcag 
tagtgcctca 
tcatgccttc 
tcccacttct 
tgtgtttttt 
tgcatatcat 
tccattcatt 
actgtgcaaa 
ttcaagcatc 
ttacaaggtg 
agtaccatgt 
ttacagttct 
tgtcacttga 
ttttgatttg 
tcattcagta 
acacaatcat 
gaaacttgag 
cagaaattaa 
gattcctctt 
aaccaaatca 
agtgagggaa 



ttttattggt 
tcttaccttc 
tgtctccaac 
ttctctgtgt 
ataagcaact 
atgattccat 
ccacagcttc 
tcccacaaat 
ctttataatc 
gaactatgtt 
tttactataa 
tctggtacta 
atcagggtca 
gacaaaaatc 
cactggacac 
agaccagtaa 
gcagagagaa 
tccttaggtg 
aatgtttcac 
agagcagaca 
gcatgataga 
tttctccaac 
tctctcctag 
gattattaaa 
ttaaattgta 
gtaaaaaact 
tataattgag 
tattcaggaa 
aagcagacaa 
ttcaaatgtt 
cccctgctca 
catagaactt 
ctgttacata 
agtcccaggg 
aaccccttca 
ccaatggatg 
gagacagcta 
atttgatggt 
agtctcttct 
aaaaaggatt 
gaattgtatc 
gtgtcttctt 
tgcctaagaa 
tgtaccatat 
tggctattat 
aagcatcatc 
ccaattttct 
gccagcaatg 
gtttttgatc 
catttccctg 
ttcctcagtt 
aatcatatat 
aaagaatcaa 
tgggttccca 
gcgatattct 
cagaccaaaa 
ggtaattggg 



taaaattgct 
acccacactg 
ctatattcat 
gaccatagat 
aaaaccagtg 
ggtcccctcc 
tatgacatca 
ccctcctctc 
cctgactctt 
ttgcttcttc 
agggaagtga 
tgctaattta 
taggcctctc 
taatccaaga 
atcttgcctg 
tgatttttcc 
cattggttgt 
ttgaagttga 
acatgtgttt 
atactgtaaa 
tggagaattc 
gtttagcacc 
taaaggaaaa 
attatttagt 
cattctactt 
ctaaacagaa 
aattgatcta 
aaaggtaagt 
tctactcctg 
gtcccttttc 
ccaacacacc 
tcacagcacc 
tgcagctaga 
agctctgggg 
gctccttggg 
gctgtgagca 
tatcaggctt 
tgtttatggg 
ccacactttg 
gaagtatgca 
ttgggtattc 
ttatgactgg 
tgtcatagat 
tttttgtatc 
aaataaaact 
tggatatttg 
gaggaaccac 
gaagagtgtt 
ttagtcattc 
atgactaagg 
gagaattcct 
gtatgtatat 
acttagttgc 
agatcatgca 
ttttaacata 
ctcttttatt 
ttagatttgg 



ctttgacaat 
agttttctct 
accttcttat 
ttagaaaaac 
acggtttctc 
atgatcagta 
gatttgcaaa 
tctgtctctt 
ggagagggat 
tagcatcttg 
catttgttaa 
actaaaccac 
taagcaacat 
aagctttgtt 
gcaggttcat 
ccaccagcct 
gcttcagctt 
acgatgaacc 
ttgttctaaa 
tcattagtga 
caaaggaaga 
atgtgccctg 
tggtaaatct 
gagtttatga 
atcccaggga 
taatctcttt 
aagtataatg 
tcctcttccc 
tgcctttctt 
ctggtttccc 
cacatccact 
aagggcctct 
gccatgaatc 
gtactggtta 
tactttctgt 
tccacttctg 
ctgtcagaaa 
atggatcccc 
tctcggtaac 
cactttggcc 
tgagcttctg 
gttacctcac 
tcactgtttt 
catttctctg 
gctatgaaca 
ccttggagtg 
caaactgatt 
cctccttctc 
tgactggtgt 
atgttaaaca 
tctttagttc 
gatttggcaa 
ctcatttaga 
ggaagaatgg 
cctctacctt 
acctatgaat 
aagtgactct 



33360 
33420 
33480 
33540 
33600 
33660 
33720 
33780 
33840 
33900 
33960 
34020 
34080 
34140 
34200 
34260 
34320 
34380 
34440 
34500 
34560 
34620 
34680 
34740 
34800 
34860 
34920 
34980 
35040 
35100 
35160 
35220 
35280 
35340 
35400 
35460 
35520 
35580 
35640 
35700 
35760 
35820 
35880 
35940 
36000 
36060 
36120 
36180 
36240 
36300 
36360 
36420 
36480 
36540 
36600 
36660 
36720 
tttgctaaat gtatctggca tgcatctatg
cgtctgagtt ctgtggctag cttgtctctg 
atcttggaat aaaatttaga atatttcatt 
tgccatacct ttatattatt cttcctcagt 
tgttgtggct cctccgtctg gcatgcctgc 
cttcatatat tcttcacatt cgctcatatt 
aagtgtaatc tgcttttatc tggtgagaga 
acccattttc ttcttcttct ataaagctta 
tgataatgac agactcaaaa taactagcca 
atgccattct tgactcttga cacctatgtg 
accagtaatt tatcatgact agcaaataat 
gtaaaattat atctatgtct ggaaaaaatg 
actgtcttta tgtaccccag ttatgatctt 
tgacatagtg tgatagttta tcatcactgt 
ttcctctcct gctcacacca tcatcacctg 
aacatggtat atcatatctg tttgtatcat 
agggtttttt tttcagcaaa aatacataat 
tgaaacaatt gaagacaagg catggcatgg 
aggaaaccct atccagagtt tttccccagg 
tctggtggct gtggagacaa aattggcctc 
aaccttaaag aaactcaatg tggctcacaa 
tttttccaat ctgacgaacc tagtacatgt 
tactgtcaac gacttacagt ttctacgtga 
gtctttgaac ccaattgact tcattcaaga 
actgactcta agaggtaatt ttaatagctc 
ggctggttta cacgtccatc ggttgatctt 
aatttttgaa ccctctatca tggaaggact 
aacatataca aatgattttt cagatgatat 
tgcaatgtct ctggcaggtg tatctataaa 
atggcaatcc ttatcaatca ttagatgtca 
ctttcttaaa agtttgactt taactatgaa 
cctaccaagt ctcagctatc tagatcttag 
ttcttattct gatttgggaa caaacagcct 
catcattatg agtgccaatt tcatgggtct 
ctctacttta aaaagggtca cagaattctc 
ccttgacatc tcttatacta acaccaaaat 
cagtctcaac acattaaaaa tggctggcaa 
ctttgcaaac acaacaaact tgacattcct 
atcttggggg gtatttgaca ccctccatag 
tctattgttt ttggattcat cccattataa 
cagtttcaat cgcatagaga catctaaagg 
cttcttcaat cttactaaca attctgttgc 
gtgggtcaag gaacagaagc agttcttggt 
tgtagagatg aatacctcct tagtgttgga 
gacaatcatc agtgtgtcag tggtcagtgt 
ataccacttc tattttcacc tgatacttat 
aagcatctat gatgcatttg tgatctactc 
gctggtaaag aatttagaag aaggagtgcc 
ctttattcct ggtgtagcca ttgctgccaa 
gaaggttatt gtggtagtgt ctagacactt 
tgagattgct caaacatggc agtttctgag 
tgagaaggtt gagaagtccc tgctgaggca 
aaacacctac ctggaatggg aggacaatcc 
taaaaatgcc ctattggatg gaaaagcctc 
agaaacggca acttggacct gaggagaaca 
caattaataa atgctacagc tcacctgggg 
catggctgct aagctatagc atggacctta 



acaacatctg tcatgaatca ctgttggctg 36780 
tggaagcttt acgtagtaca gcttacattt 36840 
gagcttgtga gtctacacta ttcccactct 36900
ttccttgttg cccttcagtc acagagactc 36960 
taactactac aacttttgga tcgctgtttt 37020 
gatcattgaa atttccactt acttattctc 37080 
gggtcaattc ttttgatgtg aatattctta 37140 
ctcatgtccc taataattaa catttacctg 37200 
tcatatatca gtaaagtttt gtaaacattt 37260 
tcattatata tgcctttaaa attaactttc 37320 
gaccacccat attgcctata ctcattagtt 37380
cataaattaa tctaagacta ctacatatca 37440 
gaattgattt tttctaatgg atttgctgcc 37500 
agcaagtgtg aaaatgacaa atctgcagag 37560 
ttttgctctg tacagttttc tctttacaat 37620 
agtatggtag ggactgttat gtcattagaa 37680 
tggtatctct tttgcccata ggtgtgaaat 37740 
cttacaccac ctctcaaact tgatactgac 37 800 
aagtttctct ggactaacaa gtttagagaa 37 860 
tctagaaagc ttccctattg gacagcttat 37920 
ttttatacat tcctgtaagt tacctgcata 37980 
ggatctttct tataactata ttcaaactat 3 8 040 
aaatccacaa gtcaatctct ctttagacat 38100 
ccaagccttt cagggaatta agctccatga 38160 
aaatataatg aaaacttgcc ttcaaaacct 38220 
gggagaattt aaagatgaaa ggaatctgga 382 80 
atgtgatgtg accattgatg agttcaggtt 38340 
tgttaagttc cattgcttgg cgaatgtttc 38400 
atatctagaa gatgttccta aacatttcaa 3 8460 
acttaagcag tttccaactc tggatctacc 38520 
caaagggtct atcagtttta aaaaagtggc 38580 
tagaaatgca ctgagcttta gtggttgctg 38640 
gagacactta gacctcagct tcaatggtgc 38700 
agaagagctg cagcacctgg attttcagca 38760 
agcgttctta tcccttgaaa agctacttta 3882 0 
tgacttcgat ggtatatttc ttggcttgac 38880 
ttctttcaaa gacaacaccc tttcaaatgt 38940 
ggatctttct aaatgtcaat tggaacaaat 39000 
acttcaatta ttaaatatga gtcacaacaa 39060 
ccagctgtat tccctcagca ctcttgattg 39120 
aatactgcaa cattttccaa agagtctagc 39180 
ttgtatatgt gaacatcaga aattcctgca 39240 
gaatgttgaa caaatgacat gtgcaacacc 39300 
ttttaataat tctacctgtt atatgtacaa 39360 
gattgtggta tccactgtag catttctgat 39420 
tgctggctgt aaaaagtaca gcagaggaga 39480 
gagtcagaat gaggactggg tgagaaatga 39540 
ccgctttcac ctctgccttc actacagaga 39600 
catcatccag gaaggcttcc acaagagccg 39660 
tattcagagc cgttggtgta tctttgaata 39720 
cagccgctct ggcatcatct tcattgtcct 397 80 
gcaggtggaa ttgtatcgcc ttcttagcag 39840 
tctggggagg cacatcttct ggagaagact 39900 
gaatcctgag caaacagcag aggaagaaca 39960 
aaactctggg gcctaaaccc agtctgtttg 4002 0 
ctctgctatg gaccgagagc ccatggaaca 40080 
ccgggcagaa ggaagtagca ctgacacctt 40140 
cctttccagg ggtatgaatt acctaactcg ggaaaagaaa cataatccag aatctttacc 40200
tttaatctga aggagaagag gctaaggcct agtgagaaca gaaaggagaa ccagtcttca 40260
ctgggccttt tgaatacaag ccatgtcatg ttctgtgttt cagttgcttt agaagagtat 40320
tgatagtttc aactgaactg aacggtttct tactttccct tttttctact gaatgcaata 40380
ttaaatagct ctttttgaga ggtcttcatt ccaatttcat cttccatttt atgtcatttt 40440
cttttctttt ttttttttat ctaattctat aagaaatatg attgatacac gctcacagat 40500
agcctggcca atcctaagaa tgctatattt attaaataca attcctagta tacttttact 40560
tttataaatt cagttatcgt ttttcatgcc ttgactataa actaatatca taaataagat 40620
tgttacaggt atgctaagaa ggcccatatt tgactataat tttttaagaa agtatgtaaa 40680
atatactttg tcatattgtc actgaatgtc attcttaagt tattacctaa gttatggatg 40740
tcacagagtc agtgttaaaa ataatttggt tgatagaaat atttttaatc aggagggaaa 40800
^gtggagagg ggtgcaggaa cagaaatcat gatttcatca tttattcttg atttttccgg 40860
aagttcacat agctgaatga caagactaca tatgctgcaa ctgatgttcc ttctcatcaa 40920
ggatactctc tgaaggactt gagaacattt tggggaggaa gaaaggtcta acatcctttt 40980
ccttcatcat tctcatttct ggacatgcct tgtgagatgg atgaatgttg ggagtacaca 41040
tttctgcttt caccttattt cagtcagcat gaacactgaa tatataatgt catttcacag 41100
tgtgtgtgtg tgtgtgttgt gtatgtacat atatgaacct gtacatgtgt ttaagtttaa 41160
agagaaaata gtgtacagag cagctctata tttgtgatag ggctttaaat agttgagcta 41220
attcagaaaa gtatggagat ttcttggtaa aggaaaccaa agtagaatca ttacaagatc 41280
taacaataaa aattttgaaa caatcctaca agtaaatata ttggattttc ttgtccatta 41340
agacaatatt catactattg aaattatgga aacaaccctt ggaaggttaa tgcatagaga 41400
cagaatgcta tctacttgca gtggaatgtg atttgacctt ggagaagaag caaaccttgc 41460
tacttgtgag cagatgcata aaggtggagg ttttttattg taagtgaaat atgccaggca 41520
cagaaggaac tggcctttca ggaacttttg atgacatgag caaagttaga aaaaataata 41580
tgcagaacaa tagaagagga agacaaaaga aagacagccc taggatgtat tcttcacaac 41640
gattttaaac aatatgcttg aaagagaatg aagttattag tatcaattaa gatgtctaca 41700
attttcataa ttccattcaa actggaacat agccacctaa ttatttgtct cttgttagcc 41760
aagtgaaata gcagatcaag aatctcccca tttttctgat ataaaaaccc aaattctaat 41820
gcagtaaatg tcttgtcaat cagccagata gcacagaaga ggcaaggcga cagtctgtgc 41880
cccttccctc tcacagaaac tcctgtgcac tctagcccac tgcttcaggc tacaagctag 41940
aaaagcaaga agtgaaagtg ccacagttct ctatgtggtt agtgccagtc agggtcattc 42000
aacttaaacc atgagtcatt aagaaaatac atatgcatgc atgcattaat gcacagagta 42060
gtttatttat aacaactctt tccataaagg gctggggagt tttcaacaaa atataaagga 42120
acaattagtt taatcaaaag aaagaaatat aggcagaaga aagaaatgaa agaaagaaag 42180
gaaagtttta actgtgtatt ccaggtttaa ttctagagat cttctggaat tttagagagt 42240
gtgacttttg gagaattcct aaactcattt tcagattata ttacgtatgt gacttggcct 42300
tcatctgtct gagagctaag aaagaaatga agatcatgca tttattatta ggccattaca 42360
aactaataaa tataaagata aaagggagac tctgtggatg agtctccctc ttggctttct 42420
tatgggtagt cagagagaag cactcagtag ccttatcctt gacaacattt ttgtcacatt 42480
tgttttccca gtctgtagga caacagcagt ccttatgact aaagtagatt gtatcttttt 42540
tacctagctt ctattcatct gtgttgtcct agcttccttt ttgagtctac agcctttgag 42600
aaatcactag aagtcactgg aacctcatgc tttgacttga ggcagtcctc atatgtgttc 42660
ctaggtactc gaggggtcag ttgggagact ggggagccat atcttaacca tcagctttgc 42720
ttccttggtg ttgagcatca tgcctgacaa agtaagcaga caatgcctgt atacgtgaag 42780
aagaggagaa tcattaatgc atgttttctt ggtgtgctgt tgtccttgat acattccagt 42840
tcagaatcta aagtcctagg gatcttagct gtcaacttag ttttccctgt ctgtcacttt 42900
gtatggatga tttaaattgc ttcttcactt ggttgcttga caccatgtat tctaaaattt 42960
tgtggaaggt gtgtgttggg ggggggcgta gttctaacaa tagtgttctc tagtggatac 43020
attaaaatca tattcagcta attaatattt gattaagttt tgcatgctat accgatttga 43080
taaacattca caaaatcaca ggcttcaaga tttttcttaa cacatccaaa gtacacaggc 43140
attaaatggg caaaactaaa tatcaaactg actttattta atagtttctc tactgttctc 43200
ttttgtttta tgtcaagagt tgaatgccac tgttctgtat ttttaattat ttattgtttg 43260
ctattgtgag aattcaaagc cagaactttg aggagctgac agaggcactg tggcctatga 43320
agacagtttt tggagttaac aatttccttg gtaactatgg actatgtctc cacacttcag 43380
ctctcatatc tgatggaata aactcctttc caggaggctt ctacttatgc taatgcaccc 43440
aagcaaacaa ggaggctaat agaaccagct gtttctgtct ttatagcaat ttcccaacat 43500
tctacacttg aggatttctt ctgtcacatg atttttttca ttgggcattc tttcaatcct 43560
tcattaaatg gccgagactt ctcactagac cccaactcaa tgaaattctt aagctgctag 43620
cattgaacaa cactgacttt ttcaaagcac cttgataggg aatttaagct ggaccatctg 43680
aagcaggaaa gtctgttgtt ttgatggaat ttcctaatgg taccattgtg gctttatttt 43740
gccttgttaa tgtaagggat tcaaagcatt tcaacttact actcatagtt caagcatcta 43800
ttttgcagat gcactgaaaa ttaagagatt ggagagtttg tcatatatat ttccatcatc 43860
aactattcta gttcttacta aagaaggagg gtgcaaaaat ttgaaggata tgttaaagtg 43920
ccttctatac ttaatgattc ttctagaaaa ggcaaagtgt tgatcttgtt ctttgttatg 43980
gtattatatc ttctcatggt aatttgaaag aagtttacat accaatttca gtttgtttac 44040
ctaggccttg agagtcattc tacagtacac gattaggcta ctatgaagac aaaagaaatc 44100
attgtgggga aactcagtac agctctagat ttacctttta taatagatga atcccagaat 44160
gataaagatc aagcctggca tgatgttaat ttagtgggct aggatcctgg aaacctccta 44220
aaataggaca tcccatgcat ttggccttag ccagtgaggc atctctgaga aagtgtagaa 44280
aaacttgcaa ggaggttcag tgctctgaaa gacacagagt caaatgtaca tgtaattcca 44340
gttcttcttt tatatatgtg tactttacat agtccctgaa gtatcgagag gctcaggtat 44400
^g^tgctacc accttgatag agttcactta gccaaaatgc agaaatggat gcccagagag 44460
aatagattac ttgtcctgca tcctgtaact taaaatgtgt taataatcat cataataaat 44520
tctatctgcc aaatatttca tatgtgcatg agactgtttt agtttaatta ttaaaattgc 44580
tttctgatgc agctcttagc cacattgtca tttcccatac aatgaaactg agaccaaaaa 44640
gcaaattctc caattccaag ggtagaattc aagtaatcct gatatccaga gctgctaatt 44700
ttttgccaca cagtagactg ctgcagtgtc tgggcttttt tgctggggct cattcactca 44760
ctaacgggag aatcctgtgg acaaggtcag caactccctt accatctaga aattgaaggt 44820
ttcaaaggca ctgcatgtga ctttccttga tttctatgga aatgaagatg gtccctcctg 44880
tgacagtgct aagtgccgag tctgagtgta aatgtgcttt ttggcacaaa ttgttctgtt 44940
ctaatagtgt tgattataat tataaaataa tgtgtttctg aaaggctgca agcaattctg 45000
ggaatgacaa taagggtttc gaaacaacat ggtatttatg tgagaagtgt tttgttgaaa 45060
attaaacctg tgtttaggag aaaggatcct gttgtttgct cctaagaaac tatcacacca 45120
tgtaattaaa tcagagccag ttggttgcca attggagttc ttgtctcaca tgaacaatat 45180
tgtatcacct acaacaaaca agatatgact gaccagaggt agccaagact ctttacccaa 45240
atcctgtttc tctatcttct cagggcccag aaaaaagatg gaaatgcatg gtcagttttt 45300
ttccaaggct gggaattaac cttgtagggt gaagccttcc tcaagttcat ctcagattgt 45360
ccgtaaggaa taggtttttc attcaagggc cttttatagg aggctgtatc tgtaaataag 45420
tgaggaattc aatgtttgag aggctgtctt gacttccttt cttgggagga aaaacaaaat 45480
ccttctatga agattaggaa tgtcttcgat gttctcagac ctcaaaggca gaaaaaagta 45540
tgcagtgtaa tttgtttgta tgtatctctc ttaaaataat atctaccata acattgtctc 45600
ccaacccgga tttgtgtttt attttcacca aggacatcat aaggtttaaa gcagatcttg 45660
caagggacgt cataaaaata gatatatgac aggatggtaa agtttaccag gctgaagaac 45720
cacttgatga ttttggctat atttaattat ataaatttct gcttttatta tctctcttgc 45780
tagaaatttt atttgataac tagagtttaa taatctgtat ttttaaaaat attctatgtg 45840
caattttaag tataaacaga tctggaaatt actatttaag aggcaacagc ctataatgta 45900
ccatgtttaa tatggccatg tgctctgtcc ttgagattta ctgctgagag ccaaagaaag 45960
atcaacaaaa tggaacggga aacttattta tttatttatt tatttattta tttatttatt 46020
tatttattta tttatttatt ttaaagaaaa aggtgcttca tttatctgat gattttattc 46080
ttttacactg tgtaattgat tcttctcaat tctatctgat cagactcatg tggaagaatc 46140
tgtccagttt gatgtaatct tcaaacatcc acatagaagt tataatctga cagtcatgtg 46200
tttctcctgg tttctacatt atatgttgcc ttcttcatcc ccttttggaa tttgagatac 46260
ataagcttaa atcagaataa tatcatggtc tgtcatgaac tctctgaggc atctgttgac 46320
agctttaatt tattggttta tcaaccccaa acataccaag tctaacttac ctcccatttg 46380
taaactgaat attcacttgt cactgacata cacagctgca acaaatggcc ttctctgtaa 46440
agcaccaggc tctcctgcac agacttacca cataattgtc agtcttccca ggaaaccctt 46500
ttcattcctg ttgaggggag gtaaggcagt gagcactaat agcttaaatt cagtcatttt 46560
gacctttaaa ctaccaaccc tgaatcttct ggaggagtct atggctcccc agtgggaaac 46620
gcatgctgga gaaacttact acttgcaaaa agcacttttg aaataagctg tggggatgaa 46680
tctctgctta atgctgtgct cagctcactg cagggtcctg cggagtcttt actcttcatc 46740
ttctgcagca tgggctgtgg cctgagagct gcactgctaa gtgtagggag cctcctttct 46800
gccactcact gaattagggt ctgaccaatt gtgtcattca gggtgcagac tagccactag 46860
aaaacttcct ctgagctcaa gtatcatacc ccgagaacgg cacagagagg taggaccatt 46920
atttttgcag ggcatgagtt gcctgcaaat tagatgggtg tattttttta tggttaatgt 46980
gctggttatt tttacttatc atgattgatg agtggtaaac aatgacctct ataaaaatac 47040
atgtgtgttt agaatatgag tttattagag ggaaaaaaca aaatttagca gagagatgca 47100 
gatgtggaga gagacaggag aaagggctag agatggatat cagcagttgg gggcagaggt 47160 
gtgcatctct ataatgtgcc agagacctgg tgtggagatg cttccaggag tctatggggg 47220 
tgtctttaac ttcagctaag agatcctagc actggcagat acagagcttg aagtggcaac 47280 
ctcctttata gccaactaag atccctcagt ggagggataa ggacaacaac ccactcacaa 47340 
aacttttgac ccaaaatctg tcctgtctgc aagaagggac agaaatggaa ccgagattga 47400 
gggcatggcc aatcaatgac tatcccaact tgagactcat ccctctagac tgaaacacaa 47460 
agaaaagggc aaacatgggc agaaatttgg accctgaact tatgtagcat atgtacagct 47520 
tggtattcat gtgtggattc ctcaacaact gcagcagggg ctgtccctga atctgttgcc 47580 
tgcttgtgga tcctgttccc ctaactaagt tgccttgtct ggtctcagtg agagagggat 47640 
gaaactcttc ctgcagtgac ttgatatgtc aaggtcaagt gatacccagg ggctgggagt 47700 
cttcccattc tcagaggaaa aggggaagag gcgtggggaa gggactgtgt gagggggcac 47760 
tgggaagagg gatgctgaga ttggggtgta aggtgaacaa gtaagtaaat taatggaaaa 47820
aaggaagtta tcaccagtgc aattcccaaa gggaaagaag caaacccctg tcagatgatg 47880 
ggctgaagtt ccggttatcc ttcttgcatg cttacctctg caaaacagtc tccacatctg 47940 
taaaactcca aagatgaagt aaatgtccat ctccacaatt ctattctgta attagaacag 48000 
taaccctacc atgcaactct tttgctctcc tggactgtgg ttctaacatt tgtgacctca 48060 
ttatagcata caaagactag aagcatcttt catcaattaa taagcactca agcattagta 48120 
atttttcact ttttcctcag ttccagaaaa ggattgagct aagatcagtt gagtggttaa 48180 
acaaagtact attgaaggca ggaaggatgg ctggttaact gctgcaacca gtgatatcat 48240 
aatataaagg ccagttcctg gatgtttgga ttcactgttt acaatgtaaa agtatatgta 48300 
cagctatagg tatgatagct ttgagagtca agtaagactg gggattcaag aaaattcaac 48360 
agagtgcaat tgaaatacca taaatgatat gtatctcttt tgccaaatca tataaccccc 48420 
aaaacacctt ccatcatgca tatgcattaa gaagcttgta aattaatcat ctgcaccatt 48480 
ttcacaagat tatcttggag tttagcagtg tttttttttt atacttggcc actttgaata 48540 
atcttaagga gagaaataca gtttgtctaa atccaagcac gtcttgaact aatgcttaca 48600 
attatccttg tttcccacat ttgacattta aagtgatata tcataggttc ctacattgct 48660 
agctgtggaa gcgccatctg accccttgtg cctctcacca tctgtgaatt cttgtcagct 48720
cagagtaaac tctgcataaa tttcaccatt gaagattagt gatagaagag aactctattc 48780 
gctctttctt ctggctttat tttttatttt taatgctgtc tgattgccca aggtatgtat 48840 
ggagggtgta cacagacggt acacagacct aagtcaggtg tctaagcatc ccaggaactt 48900 
cccttccaat attcttttct gagcatatgc cctcagttag ttttcctctt catatgatct 48960
gtgctcctgt ttataccaaa ctctcggctc tggcagcatc ctcgtccaaa aagcacaagt 49020 
tcagttaagt tcactggtca cataccacca ccattttcta ctctttatac tttctttccc 49080 
tgattacatt ccaatagtgt gtaggcatga acacatgtgc acacatacac acatgtgcag 49140 
attatagtcc acttgtagca ataagaggat tctcagtaca attcgtggga gttggatttc 49200 
tcctgccccc acataggtac aattaatccc agtactcggg aggcaaaggc aggcagattc 49260 
ctgagttcaa ggccagcctg gtttaaaaag tgagttccag gacagccaaa gctacccaga 49320 
aaaaccccgt ttcaaaaaac caaataaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 49380
aaaaaaaaaa ggatcgaatt ctaattatca gccaaggtag ggaatacctt tatcttttgt 49440 
gacatatgtg gaccatactt taagtttttg tgggtactaa cttcattctt gttttatttt 49500 
tctctgtctc tctgaattct ctttctcttt cctattaccc ttatgcccaa agcatgagaa 49560 
ttccaacttc catatttgtg tttattcttt ctttgcactt ttcctctctt tctgttttgt 49620 
aactctataa ccctttttgt ttgcttgttt ttgcatggga tagttattat gcattctatc 49680 
tcactatgtt agaaaaaata gtttcagctc tgggaattga gcagttctgt gctgatttca 49740 
tgtctaacac tatatgcttt tttttcctct ccttcaaata gaggtaatag atacctttca 49800 
gtatctatta gcagaggagt ttgcagacat atacaaagtt catttttctc ctaggaagtt 49860 
ttcttttctt tgcttttcat gccatctaac atttgtagga aagctgcttt ctgctaccac 49920 
aatacaagat gcatgaaggg gcggagctaa gtgtcaaaat catgctccca aagttttata 49980 
cattttaggt tattttcaga 50000 

<210> 49 

<211> 25 

<212> DNA 

<213> Mus musculus

<400> 49

cagtcggtca gcaaacgcct tcttc 25



<210> 50 

<211> 25 

<212> DNA 

<213> Mus musculus

<400> 50 

caaggcaggc tagcaggaaa gggtg 25 

<210> 51 

<211> 24 

<212> DNA 

<213> Mus musculus 

<400> 51 

ttattcatct ttggagagga gtgg 24 

<210> 52 

<211> 26 

<212> DNA 

<213> Mus musculus 

<400> 52 

aaggaagttt agttagaacc accttg 26 

<210> 53 

<211> 26 

<212> DNA 

<213> Mus musculus 

<400> 53 

tctcctgctc acaccatcat cacctg 26 

<210> 54 

<211> 24 

<212> DNA 

<213> Mus musculus 

<400> 54 

catctgttcc atgggctctc ggtc 24 

<210> 55 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 55 

gctcggtaaa cggtgatag 19 

<210> 56 

<211> 20 

<212> DNA 

<213> Homo sapiens

<400> 56

tgagaagttc tgggcagaag 20



<210> 57 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 57 

tctctggtct aggagagg 18

<210> 58 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 58 

ccagtccaat aatgaaatg 19

<210> 59 

<211> 30 

<212> DNA 

<213> Homo sapiens 

<400> 59 

ccatcacatc tgtatgaaga gctggatgac 30

<210> 60 

<211> 30 

<212> DNA 

<213> Homo sapiens 

<400> 60 

tgactttctt tgtcatgggt tccttgactg 30

<210> 61 

<211> 18 

<212> DNA 

<213> Mus musculus 

<400> 61 

atgccatgcc ttgtcttc 18

<210> 62 

<211> 16 

<212> DNA 

<213> Mus musculus 

<400> 62 

tttaaattct cccaag 16

<210> 63 

<211> 15 

<212> DNA 

<213> Mus musculus 



<400> 63 

cagctcttct agacc 15



<210> 64 

<211> 20 

<212> DNA 

<213> Mus musculus 

<400> 64 

tgtgaacatc agaaattcct 20

<210> 65 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 65 

tgagattgct caaacatgg 19

<210> 66 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 66 

ttgaaacaat tgaagacaag gc 22

<210> 67 

<211> 19 

<212> DNA 

<213> Mus musculus 

<400> 67 

cctggctggt ttacacgtc 19

<210> 68 

<211> 22 

<212> DNA 

<213> Mus musculus 

<400> 68 

tttcatgggt ctagaagagc tg 22

<210> 69 

<211> 18 

<212> DNA 

<213> Mus musculus 

<400> 69 

aagaactgct tctgttcc 18

<210> 70 

<211> 19 

<212> DNA 

<213> Mus musculus 



<400> 70 

tcagaaactg ccatgtttg 19



<210> 71 
<211> 20 
<212> DNA 

<213> Mus musculus

<400> 71

tgagctggta aagaatttag 20

<210> 72 

<211> 21 

<212> DNA 

<213> Mus musculus 

<400> 72 

ctgacgaacc tagtacatgt g 21

<210> 73 
<211> 19 
<212> DNA 

<213> Mus musculus 
<400> 73 

atgtcaagtt tgttgtgtt 19

<210> 74 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 74 

gagctggatg actaggatta atattc 26

<210> 75 
<211> 22 
<212> DNA 

<213> Homo sapiens 
<400> 75 

tcaaattgca caggccctct ag 22

<210> 76 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 76 

caatctctct ttagacctgt cc 22

<210> 77 

<211> 22 

<212> DNA 

<213> Homo sapiens 



<400> 77 

aatactttag gctggttgtc cc 22



<210> 78 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 78 

gaagttgatc taccaagcct tg 22 

<210> 79

<211> 23

<212> DNA

<213> Homo sapiens

<400> 79

ggaagtcatt atgtgattga gac 23

<210> 80

<211> 26

<212> DNA

<213> Homo sapiens

<400> 80

cttcctggac ctctctcagt gtcaac 26

<210> 81

<211> 22

<212> DNA

<213> Homo sapiens

<400> 81

gaaggcagag ctgaaatgga gg 22

<210> 82

<211> 26

<212> DNA

<213> Homo sapiens

<400> 82

tcagatgaat aagaccatca ttggtg 26

<210> 83

<211> 18

<212> DNA

<213> Homo sapiens

<400> 83

aacaagtgtt ggacccag 18



<210> 84

<211> 19

<212> DNA

<213> Homo sapiens

<400> 84

gtaaatttgg acagtttcc 19

<210> 85

<211> 21

<212> DNA

<213> Homo sapiens

<400> 85

ttcagtattc ctatcactca

<210> 86

<211> 20

<212> DNA

<213> Homo sapiens

<400> 86

ttataagtgt ctgaactccc 20

<210> 87

<211> 19

<212> DNA

<213> Homo sapiens

<400> 87

tcggtcctca gtgtgcttg 19

<210> 88

<211> 18

<212> DNA

<213> Homo sapiens

<400> 88

gtgtcccagc acttcatc 18

<210> 89

<211> 18

<212> DNA

<213> Homo sapiens

<400> 89

aacctcctga ggcatttc 18

<210> 90

<211> 19

<212> DNA

<213> Homo sapiens

<400> 90

gtttcaaatt ggaatgctg 19

<210> 91

<211> 18

<212> DNA

<213> Homo sapiens

<400> 91

aaggaaacgt atccaatg 18



<210> 92 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 92 

aagcacactg aggaccgac 

<210> 93 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 93 

gatgaagtgc tgggacac 

<210> 94 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 94 

tcctcttcag atagatgttg 

<210> 95 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 95 

tttctttgtc atgggttc 

<210> 96 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 96 

tttaggttct tattcagcag 

<210> 97 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 97 

gctctagatt ggtcagatta g 

<210> 98 

<211> 839 

<212> PRT 

<213> Homo sapiens 



<400> 98 

Met Met Ser Ala Ser Arg Leu Ala Gly Thr Leu Ile Pro Ala Met Ala
1 5 10 15

Phe Leu Ser Cys Val Arg Pro Glu Ser Trp Glu Pro Cys Val Glu Val
20 25 30

Val Pro Asn Ile Thr Tyr Gln Cys Met Glu Leu Asn Phe Tyr Lys Ile
35 40 45

Pro Asp Asn Leu Pro Phe Ser Thr Lys Asn Leu Asp Leu Ser Phe Asn
50 55 60

Pro Leu Arg His Leu Gly Ser Tyr Ser Phe Phe Ser Phe Pro Glu Leu
65 70 75 80

Gln Val Leu Asp Leu Ser Arg Cys Glu Ile Gln Thr Ile Glu Asp Gly
85 90 95

Ala Tyr Gln Ser Leu Ser His Leu Ser Thr Leu Ile Leu Thr Gly Asn
100 105 110

Pro Ile Gln Ser Leu Ala Leu Gly Ala Phe Ser Gly Leu Ser Ser Leu
115 120 125

Gln Lys Leu Val Ala Val Glu Thr Asn Leu Ala Ser Leu Glu Asn Phe
130 135 140

Pro Ile Gly His Leu Lys Thr Leu Lys Glu Leu Asn Val Ala His Asn
145 150 155 160

Leu Ile Gln Ser Phe Lys Leu Pro Glu Tyr Phe Ser Asn Leu Thr Asn
165 170 175

Leu Glu His Leu Asp Leu Ser Ser Asn Lys Ile Gln Ser Ile Tyr Cys
180 185 190

Thr Asp Leu Arg Val Leu His Gln Met Pro Leu Leu Asn Leu Ser Leu
195 200 205

Asp Leu Ser Leu Asn Pro Met Asn Phe Ile Gln Pro Gly Ala Phe Lys
210 215 220

Glu Ile Arg Leu His Lys Leu Thr Leu Arg Asn Asn Phe Asp Ser Leu
225 230 235 240

Asn Val Met Lys Thr Cys Ile Gln Gly Leu Ala Gly Leu Glu Val His
245 250 255

Arg Leu Val Leu Gly Glu Phe Arg Asn Glu Gly Asn Leu Glu Lys Phe
260 265 270

Asp Lys Ser Ala Leu Glu Gly Leu Cys Asn Leu Thr Ile Glu Glu Phe
275 280 285

Arg Leu Ala Tyr Leu Asp Tyr Tyr Leu Asp Asp Ile Ile Asp Leu Phe
290 295 300

Asn Cys Leu Thr Asn Val Ser Ser Phe Ser Leu Val Ser Val Thr Ile
305 310 315 320

Glu Arg Val Lys Asp Phe Ser Tyr Asn Phe Gly Trp Gln His Leu Glu
325 330 335

Leu Val Asn Cys Lys Phe Gly Gln Phe Pro Thr Leu Lys Leu Lys Ser
340 345 350

Leu Lys Arg Leu Thr Phe Thr Ser Asn Lys Gly Gly Asn Ala Phe Ser
355 360 365

Glu Val Asp Leu Pro Ser Leu Glu Phe Leu Asp Leu Ser Arg Asn Gly
370 375 380

Leu Ser Phe Lys Gly Cys Cys Ser Gln Ser Asp Phe Gly Thr Thr Ser
385 390 395 400

Leu Lys Tyr Leu Asp Leu Ser Phe Asn Gly Val Ile Thr Met Ser Ser
405 410 415

Asn Phe Leu Gly Leu Glu Gln Leu Glu His Leu Asp Phe Gln His Ser
420 425 430

Asn Leu Lys Gln Met Ser Glu Phe Ser Val Phe Leu Ser Leu Arg Asn
435 440 445

Leu Ile Tyr Leu Asp Ile Ser His Thr His Thr Arg Val Ala Phe Asn
450 455 460

Gly Ile Phe Asn Gly Leu Ser Ser Leu Glu Val Leu Lys Met Ala Gly
465 470 475 480

Asn Ser Phe Gln Glu Asn Phe Leu Pro Asp Ile Phe Thr Glu Leu Arg
485 490 495

Asn Leu Thr Phe Leu Asp Leu Ser Gln Cys Gln Leu Glu Gln Leu Ser
500 505 510

Pro Thr Ala Phe Asn Ser Leu Ser Ser Leu Gln Val Leu Asn Met Ser
515 520 525

His Asn Asn Phe Phe Ser Leu Asp Thr Phe Pro Tyr Lys Cys Leu Asn
530 535 540

Ser Leu Gln Val Leu Asp Tyr Ser Leu Asn His Ile Met Thr Ser Lys
545 550 555 560

Lys Gln Glu Leu Gln His Phe Pro Ser Ser Leu Ala Phe Leu Asn Leu
565 570 575

Thr Gln Asn Asp Phe Ala Cys Thr Cys Glu His Gln Ser Phe Leu Gln
580 585 590

Trp Ile Lys Asp Gln Arg Gln Leu Leu Val Glu Val Glu Arg Met Glu
595 600 605

Cys Ala Thr Pro Ser Asp Lys Gln Gly Met Pro Val Leu Ser Leu Asn
610 615 620

Ile Thr Cys Gln Met Asn Lys Thr Ile Ile Gly Val Ser Val Leu Ser
625 630 635 640

Val Leu Val Val Ser Val Val Ala Val Leu Val Tyr Lys Phe Tyr Phe
645 650 655

His Leu Met Leu Leu Ala Gly Cys Ile Lys Tyr Gly Arg Gly Glu Asn
660 665 670

Ile Tyr Asp Ala Phe Val Ile Tyr Ser Ser Gln Asp Glu Asp Trp Val
675 680 685

Arg Asn Glu Leu Val Lys Asn Leu Glu Glu Gly Val Pro Pro Phe Gln
690 695 700

Leu Cys Leu His Tyr Arg Asp Phe Ile Pro Gly Val Ala Ile Ala Ala
705 710 715 720

Asn Ile Ile His Glu Gly Phe His Lys Ser Arg Lys Val Ile Val Val
725 730 735

Val Ser Gln His Phe Ile Gln Ser Arg Trp Cys Ile Phe Glu Tyr Glu
740 745 750

Ile Ala Gln Thr Trp Gln Phe Leu Ser Ser Arg Ala Gly Ile Ile Phe
755 760 765

Ile Val Leu Gln Lys Val Glu Lys Thr Leu Leu Arg Gln Gln Val Glu
770 775 780

Leu Tyr Arg Leu Leu Ser Arg Asn Thr Tyr Leu Glu Trp Glu Asp Ser
785 790 795 800

Val Leu Gly Arg His Ile Phe Trp Arg Arg Leu Arg Lys Ala Leu Leu
805 810 815

Asp Gly Lys Ser Trp Asn Pro Glu Gly Thr Val Gly Thr Gly Cys Asn
820 825 830

Trp Gln Glu Ala Thr Ser Ile
835



<210> 99 

<211> 835 

<212> PRT 

<213> Mus musculus 

<400> 99 

Met Met Pro Pro Trp Leu Leu Ala Arg Thr Leu Ile Met Ala Leu Phe
1 5 10 15

Phe Ser Cys Leu Thr Pro Gly Ser Leu Asn Pro Cys Ile Glu Val Val
20 25 30

Pro Asn Ile Thr Tyr Gln Cys Met Asp Gln Lys Leu Ser Lys Val Pro
35 40 45

Asp Asp Ile Pro Ser Ser Thr Lys Asn Ile Asp Leu Ser Phe Asn Pro
50 55 60

Leu Lys Ile Leu Lys Ser Tyr Ser Phe Ser Asn Phe Ser Glu Leu Gln
65 70 75 80

Trp Leu Asp Leu Ser Arg Cys Glu Ile Glu Thr Ile Glu Asp Lys Ala
85 90 95

Trp His Gly Leu His His Leu Ser Asn Leu Ile Leu Thr Gly Asn Pro
100 105 110

Ile Gln Ser Phe Ser Pro Gly Ser Phe Ser Gly Leu Thr Ser Leu Glu
115 120 125

Asn Leu Val Ala Val Glu Thr Lys Leu Ala Ser Leu Glu Ser Phe Pro
130 135 140

Ile Gly Gln Leu Ile Thr Leu Lys Lys Leu Asn Val Ala His Asn Phe
145 150 155 160

Ile His Ser Cys Lys Leu Pro Ala Tyr Phe Ser Asn Leu Thr Asn Leu
165 170 175

Val His Val Asp Leu Ser Tyr Asn Tyr Ile Gln Thr Ile Thr Val Asn
180 185 190

Asp Leu Gln Phe Leu Arg Glu Asn Pro Gln Val Asn Leu Ser Leu Asp
195 200 205

Met Ser Leu Asn Pro Ile Asp Phe Ile Gln Asp Gln Ala Phe Gln Gly
210 215 220

Ile Lys Leu His Glu Leu Thr Leu Arg Gly Asn Phe Asn Ser Ser Asn
225 230 235 240

Ile Met Lys Thr Cys Leu Gln Asn Leu Ala Gly Leu His Val His Arg
245 250 255

Leu Ile Leu Gly Glu Phe Lys Asp Glu Arg Asn Leu Glu Ile Phe Glu
260 265 270

Pro Ser Ile Met Glu Gly Leu Cys Asp Val Thr Ile Asp Glu Phe Arg
275 280 285

Leu Thr Tyr Thr Asn Asp Phe Ser Asp Asp Ile Val Lys Phe His Cys
290 295 300

Leu Ala Asn Val Ser Ala Met Ser Leu Ala Gly Val Ser Ile Lys Tyr
305 310 315 320

Leu Glu Asp Val Pro Lys His Phe Lys Trp Gln Ser Leu Ser Ile Ile
325 330 335

Arg Cys Gln Leu Lys Gln Phe Pro Thr Leu Asp Leu Pro Phe Leu Lys
340 345 350

Ser Leu Thr Leu Thr Met Asn Lys Gly Ser Ile Ser Phe Lys Lys Val
355 360 365

Ala Leu Pro Ser Leu Ser Tyr Leu Asp Leu Ser Arg Asn Ala Leu Ser
370 375 380

Phe Ser Gly Cys Cys Ser Tyr Ser Asp Leu Gly Thr Asn Ser Leu Arg
385 390 395 400

His Leu Asp Leu Ser Phe Asn Gly Ala Ile Ile Met Ser Ala Asn Phe
405 410 415

Met Gly Leu Glu Glu Leu Gln His Leu Asp Phe Gln His Ser Thr Leu
420 425 430

Lys Arg Val Thr Glu Phe Ser Ala Phe Leu Ser Leu Glu Lys Leu Leu
435 440 445

Tyr Leu Asp Ile Ser Tyr Thr Asn Thr Lys Ile Asp Phe Asp Gly Ile
450 455 460

Phe Leu Gly Leu Thr Ser Leu Asn Thr Leu Lys Met Ala Gly Asn Ser
465 470 475 480

Phe Lys Asp Asn Thr Leu Ser Asn Val Phe Ala Asn Thr Thr Asn Leu
485 490 495

Thr Phe Leu Asp Leu Ser Lys Cys Gln Leu Glu Gln Ile Ser Trp Gly
500 505 510

Val Phe Asp Thr Leu His Arg Leu Gln Leu Leu Asn Met Ser His Asn
515 520 525

Asn Leu Leu Phe Leu Asp Ser Ser His Tyr Asn Gln Leu Tyr Ser Leu
530 535 540

Ser Thr Leu Asp Cys Ser Phe Asn Arg Ile Glu Thr Ser Lys Gly Ile
545 550 555 560

Leu Gln His Phe Pro Lys Ser Leu Ala Phe Phe Asn Leu Thr Asn Asn
565 570 575

Ser Val Ala Cys Ile Cys Glu His Gln Lys Phe Leu Gln Trp Val Lys
580 585 590

Glu Gln Lys Gln Phe Leu Val Asn Val Glu Gln Met Thr Cys Ala Thr
595 600 605

Pro Val Glu Met Asn Thr Ser Leu Val Leu Asp Phe Asn Asn Ser Thr
610 615 620

Cys Tyr Met Tyr Lys Thr Ile Ile Ser Val Ser Val Val Ser Val Ile
625 630 635 640

Val Val Ser Thr Val Ala Phe Leu Ile Tyr His Phe Tyr Phe His Leu
645 650 655

Ile Leu Ile Ala Gly Cys Lys Lys Tyr Ser Arg Gly Glu Ser Ile Tyr
660 665 670

Asp Ala Phe Val Ile Tyr Ser Ser Gln Asn Glu Asp Trp Val Arg Asn
675 680 685

Glu Leu Val Lys Asn Leu Glu Glu Gly Val Pro Arg Phe His Leu Cys
690 695 700

Leu His Tyr Arg Asp Phe Ile Pro Gly Val Ala Ile Ala Ala Asn Ile
705 710 715 720

Ile Gln Glu Gly Phe His Lys Ser Arg Lys Val Ile Val Val Val Ser
725 730 735

Arg His Phe Ile Gln Ser Arg Trp Cys Ile Phe Glu Tyr Glu Ile Ala
740 745 750

Gln Thr Trp Gln Phe Leu Ser Ser Arg Ser Gly Ile Ile Phe Ile Val
755 760 765

Leu Glu Lys Val Glu Lys Ser Leu Leu Arg Gln Gln Val Glu Leu Tyr
770 775 780

Arg Leu Leu Ser Arg Asn Thr Tyr Leu Glu Trp Glu Asp Asn Pro Leu
785 790 795 800

Gly Arg His Ile Phe Trp Arg Arg Leu Lys Asn Ala Leu Leu Asp Gly
805 810 815

Lys Ala Ser Asn Pro Glu Gln Thr Ala Glu Glu Glu Gln Glu Thr Ala
820 825 830

Thr Trp Thr
835